# Overview - May 2025
I need to update the overview. Note that BMF variable names are different now that I am using (for the first time) the NCCS's unified BMF file. See correspondence with Jesse.

#### Update of this notebook:
- *IRS Form 990 e-File Data (7b) -- Merge BMF Data into 990 Data and Limit to 501(c)(3) orgs.ipynb*

# Overview - old
In this notebook I merge the BMF data into the 990 data and then limit the dataset to 501(c)(3) orgs.

Read in files:
- *all filings nov 2021 - all control variables (with parsed sub-key variables and reformatted types and fillnull).pkl.gz*
- *BMF Data for 2,383,390 EINs (2010-2021 -- most recent entry per org).pkl.gz*
    - Note that this BMF file contains the most recent entry for *all* BMF EINs (not just the ones in the e-file data)
        - This is a new way of doing it

Merge files and save:
- *990 and BMF control variables.pkl.gz*
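
A minimal sketch of the merge step, assuming both frames share a 9-character string `EIN` key and the BMF side holds one row per EIN. The toy frames `filings` and `bmf_toy` and their values are illustrative only, not the notebook's actual data or code.

```python
import pandas as pd

# Toy stand-ins: many filings per EIN on the 990 side, one row per EIN on the BMF side.
filings = pd.DataFrame({
    "EIN": ["010018922", "010018922", "232705170", "999999999"],
    "fiscal_year": [2018, 2019, 2011, 2020],
})
bmf_toy = pd.DataFrame({
    "EIN": ["010018922", "232705170"],
    "BMF_SUBSECTION_CODE": ["19", "3"],
})

# A left merge keeps every filing. validate="m:1" raises if the BMF side has
# duplicate EINs, and indicator=True adds a _merge column flagging unmatched filings.
merged = filings.merge(bmf_toy, on="EIN", how="left", validate="m:1", indicator=True)

# Count the EINs that found no BMF record.
n_unmatched_eins = merged.loc[merged["_merge"] == "left_only", "EIN"].nunique()
print(len(merged), n_unmatched_eins)
```

Checking `_merge` is one way to arrive at a coverage figure like "BMF data for 341,983 of the 345,743 EINs" without a separate lookup.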

Limit DF to 501(c)(3) nonprofits:
- I have 2,016,624 990 e-filings downloaded for 345,743 unique EINs. The e-filing variable *501c3* shows that 1,546,373 of these are filings by 501(c)(3) organizations. However, in previous runs I haven't used this variable. Instead, I merged in BMF data from NCCS and selected 501(c)(3)s on their *SUBSECCD* variable -- based partly on the logic that NCCS has at least minimally 'cleaned' the data.
- Comparing the two variables shows a couple of issues:
    - There are 236 filings -- for 87 EINs -- where the e-filing has the org coded as a 501(c)(3) but the BMF data has it coded as something else.
    - There are 2,401 filings -- and 730 unique EINs -- where the e-file-based variable 501c3 is coded as not being a 501(c)(3) but the BMF variable has it coded as a 501(c)(3).
- This notebook includes some further verifications, and there don't seem to be any systematic errors. Sometimes the BMF data is off, sometimes the e-file data is off, sometimes the BMF data is just missing (I have BMF data for 341,983 of the 345,743 EINs in the e-file data), and sometimes the designation has changed over time, so the BMF data reflects only the current designation (the BMF file holds just the most recent entry per organization, and an organization's code can change).
- I think it is best to select all EINs where every year of e-file data indicates a 501(c)(3) -- and essentially ignore the BMF data when determining what is and isn't a 501(c)(3). This removes ambiguous cases and those that may have experienced a section code change.
- With this proposed solution, the final dataset has 1,544,009 filings, 2,364 fewer than the original 1,546,373 where *501c3*==1. In terms of EINs, that's a drop of 631 EINs -- from 265,528 down to 264,897. Of course, those 631 could be verified but with such a large dataset I don't think it is worth it and some of those 631 would have a section code change.
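
The "every year indicates 501(c)(3)" rule can be sketched as a group-wise filter. This is only an illustration on toy data; it assumes the *501c3* flag is coded 0/1 with no missing values, and the names `toy`, `always_c3`, and `c3_only` are hypothetical.

```python
import pandas as pd

# Toy filings: EIN "A" is always coded 501(c)(3), "B" switches, "C" never is.
toy = pd.DataFrame({
    "EIN": ["A", "A", "B", "B", "C"],
    "501c3": [1, 1, 1, 0, 0],
})

# Keep an EIN only if its minimum flag is 1, i.e. every one of its filings is coded 501(c)(3).
always_c3 = toy.groupby("EIN")["501c3"].transform("min") == 1
c3_only = toy[always_c3]

print(c3_only["EIN"].unique().tolist())
```

Using `transform` keeps the mask aligned with the original rows, so the filter drops all filings of a mixed EIN (like "B") at once rather than only its non-501(c)(3) years.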


Saved file:
- *990 and BMF control variables for 264,897 501c3 orgs (N=1,544,009).pkl.gz*

In [1]:
import numpy as np
import pandas as pd
from pandas import DataFrame
from pandas import Series

In [2]:
print(pd.__version__)

2.2.2


In [3]:
#http://pandas.pydata.org/pandas-docs/stable/options.html
pd.set_option('display.max_columns', None)
pd.set_option('max_colwidth', 250)

In [4]:
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)
warnings.simplefilter(action='ignore', category=pd.errors.PerformanceWarning)

In [5]:
import datetime
import gc

#### Set working directory

In [6]:
cd "C:\\Users\\Gregory\\IRS 990 Control Variables\\"

C:\Users\Gregory\IRS 990 Control Variables


# Read in DF

In [13]:
#%%time
#import datetime
#print ("Current date and time : ", datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"), '\n')
#df = pd.read_pickle('all NEW filings February 2024 - all control variables (with parsed sub-key variables and reformatted types and fillnull).pkl.gz',
#            compression='gzip')
#print('# of columns:', len(df.columns))
#print('# of observations:', len(df))
#df[:1]

Current date and time :  2024-03-31 01:05:36 

# of columns: 299
# of observations: 891980
CPU times: total: 24.9 s
Wall time: 25 s


Unnamed: 0,URL,F9_09_PC_FEES_FOR_SVCE_FR_TOT,F9_00_HD_BUILD_TIME_STAMP,fiscal_year,EIN,BusinessName,BusinessNameControlTxt,PhoneNum,USAddress,InCareOfNm,ForeignAddress,ForeignPhoneNum,F9_00_HD_ADDR_CHANGE,F9_00_HD_AMENDED_RETURN,F9_00_HD_CTRY_OF_DOMICILE,F9_00_HD_EXEMPT_STATUS_4847A1,F9_00_HD_EXEMPT_STATUS_501C,F9_00_HD_EXEMPT_STATUS_501C3,F9_00_HD_FINAL_RETURN,F9_00_HD_GROSS_EXEMPT_NUM,F9_00_HD_GROSS_RCPT,F9_00_HD_GROUP_RETURN,F9_00_HD_INCLUDES_SUBORD_ORGS,F9_00_HD_INITIAL_RETURN,F9_00_HD_PRIN_OFF_NAME,F9_00_HD_SIGNING_OFFICER_SIGNTR,F9_00_HD_SPECIAL_CONDITION_DESC,F9_00_HD_STATE_OF_DOMICILE,F9_00_HD_TAX_PER_BEGIN,F9_00_HD_TAX_PER_END,F9_00_HD_TAX_YEAR,F9_00_HD_TIME_STAMP,F9_00_HD_TYPE_ORG_ASSOCIATION,F9_00_HD_TYPE_ORG_CORP,F9_00_HD_TYPE_ORG_OTHER,F9_00_HD_TYPE_ORG_OTHER_DESC,F9_00_HD_TYPE_ORG_TRUST,F9_00_HD_WEBSITE,F9_00_HD_YEAR_FORMED,F9_01_PC_BEN_PAID_MEMB_PRIOR,F9_01_PC_CONTR_GRANTS_CURR,F9_01_PC_CONTR_GRANTS_PRIOR,F9_01_PC_GRANTS_PRIOR,F9_01_PC_INDEP_VOTING_MEMB,F9_01_PC_INVEST_INCOME_PRIOR,F9_01_PC_NET_ASSETS_BOY,F9_01_PC_OTHER_EXPENSE_PRIOR,F9_01_PC_OTHER_REV_PRIOR,F9_01_PC_PROF_FUNDRISING_EXP_CURR,F9_01_PC_PROF_FUNDRISING_EXP_PRIOR,F9_01_PC_PROG_SERVICE_REV_PRIOR,F9_01_PC_REV_LESS_EXP_CURR,F9_01_PC_REV_LESS_EXP_PRIOR,F9_01_PC_TERMINATION_CONTRACTION,F9_01_PC_TOT_ASSETS_EOY,F9_01_PC_TOT_EXP_PRIOR,F9_01_PC_TOT_FNDR_EXP_CURR,F9_01_PC_TOT_INDIV_EMPLOYED,F9_01_PC_TOT_INDIV_VOLUNTEERS,F9_01_PC_TOT_LIABILITIES_EOY,F9_01_PC_TOT_REVENUE_PRIOR,F9_01_PC_TOT_UBI_GROSS,F9_01_PC_TOT_UBI_NET,F9_01_PC_VOTING_MEMB_GOV_BODY,F9_01_PZ_BEN_PAID_TO_MEMB_CURR,F9_01_PZ_GRANTS_PAID_CURR,F9_01_PZ_INVEST_INCOME_CURR,F9_01_PZ_NAFB_EOY,F9_01_PZ_ORGANIZATIONAL_MISSION,F9_01_PZ_OTHER_EXPENSE_CURR,F9_01_PZ_OTHER_REV_CURR,F9_01_PZ_PROG_SERVICE_REV_CURR,F9_01_PZ_SALARIES_CURR,F9_01_PZ_SALARIES_PRIOR,F9_01_PZ_TOT_ASSETS_BOY,F9_01_PZ_TOT_EXP_CURR,F9_01_PZ_TOT_LIAB_BOY,F9_01_PZ_TOT_REV_CURR,F9_03_PC_PGMSVC_SIGNIF_CHG,F9_03_PC_PGMSVC_SIGNIF_NEW,F9_03_PC_PROG_SVC_ACC_1_CODE,F9_03_PC_PROG_SVC_ACC
_1_DESC,F9_03_PC_PROG_SVC_ACC_1_EXP,F9_03_PC_PROG_SVC_ACC_1_GRNT,F9_03_PC_PROG_SVC_ACC_1_REV,F9_03_PC_PROG_SVC_ACC_2_CODE,F9_03_PC_PROG_SVC_ACC_2_DESC,F9_03_PC_PROG_SVC_ACC_2_EXP,F9_03_PC_PROG_SVC_ACC_2_GRNT,F9_03_PC_PROG_SVC_ACC_2_REV,F9_03_PC_PROG_SVC_ACC_3_CODE,F9_03_PC_PROG_SVC_ACC_3_DESC,F9_03_PC_PROG_SVC_ACC_3_EXP,F9_03_PC_PROG_SVC_ACC_3_GRNT,F9_03_PC_PROG_SVC_ACC_3_REV,F9_03_PC_TOT_OTH_PROG_SVC_EXP,F9_03_PC_TOT_OTH_PROG_SVC_GRNT,F9_03_PC_TOT_OTH_PROG_SVC_REV,F9_03_PC_TOT_PROG_SVC_EXPENSE,F9_03_PZ_MISSION_DESCRIPTION,F9_03_PZ_SCHEDULE_O_PART3,F9_04_PC_ACTVITIES_VIA_PARTNER,F9_04_PC_CONTROLLED_ENTITY,F9_04_PC_DISREGARDED_ENTITY,F9_04_PC_EXCESS_BENEFIT_TRANS,F9_04_PC_FR_EVENT_INC_GT_15K,F9_04_PC_GAMING_INC_GT_15K,F9_04_PC_LOBBYING_ACTIVITIES,F9_04_PC_POLITICAL_ACTIVITIES,F9_04_PC_PRIOR_EXCESS_BEN_TRAN,F9_04_PC_PROF_FR_EXP_GT_15K,F9_04_PC_RELATED_ENTITY,F9_04_PC_TRANS_TO_CNTRLD_ENT,F9_04_PC_TRANS_WITH_CNTRLD_ENT,F9_05_EXP_SCHED_O_X,F9_05_PC_NUMBER_EMPLOYEES_W3,F9_05_PC_NUMBER_FORMS_1096,F9_05_PC_UNRELATED_BUS_INCOME,F9_06_EXP_SCHED_O_X,F9_06_PC_990_PROVIDED_GOV_BODY,F9_06_PC_ANNUAL_DISC_COVRD_PERS,F9_06_PC_CEO_COMPENSTN_PROCESS,F9_06_PC_CHANGES_ORGANIZING_DOCS,F9_06_PC_CONFLICT_OF_INTEREST,F9_06_PC_DECISIONS_SUBJ_APPROVAL,F9_06_PC_DELEGATION_MGT_DUTIES,F9_06_PC_DELEGATION_OF_MGT,F9_06_PC_DOCUMENT_RET_POLICY,F9_06_PC_ELECTION_BOARD_MEMBERS,F9_06_PC_FAMILY_OR_BUSINESS_REL,F9_06_PC_FORM_AVAIL_OWN_WEBSITE,F9_06_PC_FORM_UPON_REQUEST,F9_06_PC_JOINT_VENTURE_INVESTMNT,F9_06_PC_JOINT_VENTURE_POLICY,F9_06_PC_LOCAL_CHAPTERS,F9_06_PC_MATERIAL_DIVERSION,F9_06_PC_MEMBERS_OR_STOCKHOLDERS,F9_06_PC_MINUTES_COMMITTEES,F9_06_PC_MINUTES_GOVERNING_BODY,F9_06_PC_MONITORING_OF_COI_POLICY,F9_06_PC_NUM_IND_VOTING_MEMBERS,F9_06_PC_NUM_VOTING_GOV_MEMBERS,F9_06_PC_OFFICER_MAILING_ADDRESS,F9_06_PC_OTHER_COMPENSTN_PROCESS,F9_06_PC_OTHER_WEBSITE,F9_06_PC_OWN_WEBSITE,F9_06_PC_POLICIES_GOVERN_CHAPTER,F9_06_PC_STATES_WHERE_RET_FILED,F9_06_PC_WHISTLEBLOWER_POLICY,F9_07_EXP_SCHED_O_X,F9_07_PC_COMPE
NSATION_OTHER_SRCE,F9_07_PC_FORMER_OFFICER_LISTED,F9_07_PC_NO_LISTED_PERS_COMPENSD,F9_07_PC_NUM_CONTRCTRS_GRTR_100K,F9_07_PC_NUM_INDS_GREATER_100K,F9_07_PC_TOTAL_COMP_GRTR_150K,F9_07_PC_TOT_OTHER_COMPENSATION,F9_07_PC_TOT_REPRT_COMP_FROM_ORG,F9_07_PC_TOT_REPRT_COMP_RLTD_ORG,F9_08_EXP_SCHED_O_X,F9_08_PC_ALL_OTHER_CONTRIBUTIONS,F9_08_PC_CONTS_REPRTD_FNDRAISNG,F9_08_PC_COST_OF_GOODS_SOLD,F9_08_PC_FEDERATED_CAMPAIGNS,F9_08_PC_FUNDRAISING_DIRECT_EXP,F9_08_PC_FUNDRAISING_EVENTS,F9_08_PC_FUNDRAISING_GROSS_INC,F9_08_PC_GAMING_DIRECT_EXPENSES,F9_08_PC_GAMING_GROSS_INCOME,F9_08_PC_GOVERNMENT_GRANTS,F9_08_PC_GROSS_SALES_INVENTORY,F9_08_PC_MEMBERSHIP_DUES,F9_08_PC_NONCASH_CONTRIBUTIONS,F9_08_PC_PROGRAM_SVCE_REV_TOTAL,F9_08_PC_RELATED_ORGANIZATIONS,F9_08_PC_TOTAL_CONTRIBUTIONS,F9_08_PC_TOTAL_OTHER_REVENUE,F9_08_PC_TOTAL_PROG_SVCE_REVENUE,F9_08_PC_TOTAL_REVENUE,F9_09_EXP_AD_PROMO_TOT,F9_09_EXP_BENF_PAID_MEMB_TOT,F9_09_EXP_CONF_MEETING_TOT,F9_09_EXP_DEPREC_FUNDR,F9_09_EXP_DEPREC_MAG,F9_09_EXP_DEPREC_PROG,F9_09_EXP_DEPREC_TOT,F9_09_EXP_GRANT_FRGN_TOT,F9_09_EXP_GRANT_INDIV_DMSTC_TOT,F9_09_EXP_GRANT_ORG_DMSTC_TOT,F9_09_EXP_INFO_TECH_TOT,F9_09_EXP_INSURANCE_TOT,F9_09_EXP_INTEREST_TOT,F9_09_EXP_JOINT_COSTS_TOT,F9_09_EXP_OCCUPANCY_TOT,F9_09_EXP_OFFICE_TOT,F9_09_EXP_OTH_OTH_TOT,F9_09_EXP_ROY_TOT,F9_09_EXP_SCHED_O_X,F9_09_EXP_TRAVEL_ENTRTNMNT_TOT,F9_09_EXP_TRAVEL_TOT,F9_09_PC_COMP_DISQUAL_FUNDRAISE,F9_09_PC_COMP_DISQUAL_MGMT,F9_09_PC_COMP_DISQUAL_PROG_SVCE,F9_09_PC_COMP_DISQUAL_TOTAL,F9_09_PC_COMP_OFFICERS_FUNDRAISE,F9_09_PC_COMP_OFFICERS_MGMT,F9_09_PC_COMP_OFFICERS_PROG_SVCE,F9_09_PC_COMP_OFFICERS_TOTAL,F9_09_PC_FEES_FOR_SVCE_ACCT_TOT,F9_09_PC_FEES_FOR_SVCE_INVST_TOT,F9_09_PC_FEES_FOR_SVCE_LEGL_TOT,F9_09_PC_FEES_FOR_SVCE_LOBB_TOT,F9_09_PC_FEES_FOR_SVCE_MGMT_TOT,F9_09_PC_FEES_FOR_SVCE_OTH_TOT,F9_09_PC_OTHER_EMP_BEN_FUNDRAISE,F9_09_PC_OTHER_EMP_BEN_MGMT,F9_09_PC_OTHER_EMP_BEN_PROG_SVCE,F9_09_PC_OTHER_EMP_BEN_TOTAL,F9_09_PC_OTHER_SALARY_FUNDRAISE,F9_09_PC_OTHER_SALARY_MGMT,F9_09_PC_OTHER_SA
LARY_PROG_SVCE,F9_09_PC_OTHER_SALARY_TOTAL,F9_09_PC_PAYMENT_TO_AFFILIATES,F9_09_PC_PAYROLL_TAX_FUNDRAISE,F9_09_PC_PAYROLL_TAX_MGMT,F9_09_PC_PAYROLL_TAX_PROG_SVCE,F9_09_PC_PAYROLL_TAX_TOTAL,F9_09_PC_PENSION_CONT_FUNDRAISE,F9_09_PC_PENSION_CONT_MGMT,F9_09_PC_PENSION_CONT_PROG_SVCE,F9_09_PC_PENSION_CONT_TOTAL,F9_09_PC_TOTAL_FUNC_EXPENSES,F9_09_PC_TOTAL_FUNDRAISE_EXPENSE,F9_09_PC_TOTAL_MGMT_EXPENSE,F9_09_PC_TOTAL_PROG_SVCE_EXPENSE,F9_10_ASSETS_ACC_NET_EOY,F9_10_ASSETS_EXP_PREPAID_EOY,F9_10_ASSETS_INTANGIB_EOY,F9_10_ASSETS_INVENT_SALE_EOY,F9_10_ASSETS_LESS_DEPREC_EOY,F9_10_ASSETS_LOANS_DISQUAL_EOY,F9_10_ASSETS_NOTES_LOANS_NET_EOY,F9_10_ASSETS_OTH_EOY,F9_10_ASSETS_PLEDGES_NET_EOY,F9_10_LIAB_ACC_PAYABLE_EOY,F9_10_LIAB_GRANTS_PAYABLE_EOY,F9_10_LIAB_LOANS_OFF_EOY,F9_10_LIAB_REV_DEFERRED_EOY,F9_10_NAFB_RESTRICT_PERM_EOY,F9_10_NAFB_RESTRICT_TEMP_EOY,F9_10_NAFB_UNRESTRICT_EOY,F9_10_PC_BOND_LIABILITY_EOY,F9_10_PC_CASH_NON_INTEREST_BOY,F9_10_PC_CASH_NON_INTEREST_EOY,F9_10_PC_ESCROW_LIABILITY_EOY,F9_10_PC_INVEST_OTHER_SEC_EOY,F9_10_PC_INVEST_PROG_RELTD_EOY,F9_10_PC_INVEST_PUB_TRADED_EOY,F9_10_PC_LAND_BLDG_EQPMT,F9_10_PC_LAND_BLDG_EQPMT_DEPRCTN,F9_10_PC_LOANS_FROM_OFFICERS_EOY,F9_10_PC_ORG_FOLLOWS_SFAS117,F9_10_PC_ORG_NOT_FOLLOW_SFAS117,F9_10_PC_OTHER_LIABILITIES_EOY,F9_10_PC_RET_EARNINGS_ENDWMT_EOY,F9_10_PC_SAVINGS_TEMP_INVEST_BOY,F9_10_PC_SAVINGS_TEMP_INVEST_EOY,F9_10_PC_SECURED_MORTGAGES_EOY,F9_10_PC_SECURE_MORT_NOTES_EOY,F9_10_PC_UNSECURED_LOANS_EOY,F9_10_PC_UNSECURED_NOTES_BOY,F9_10_PC_UNSECURED_NOTES_EOY,F9_10_PZ_TOTAL_ASSETS_EOY,F9_10_SCHED_O_X,F9_11_PC_RECNCLTN_DONATED_SVCES,F9_11_PC_RECNCLTN_INVSTMNT_EXP,F9_11_PC_RECNCLTN_PRIOR_PER_ADJ,F9_11_PC_RECNCLTN_REV_LESS_EXP,F9_11_PC_RECNCLTN_UNRLZD_GAIN,F9_11_SCHED_O_X,F9_12_PC_ACCNT_COMPILE_OR_REVIEW,F9_12_PC_ACCTG_METHOD_ACCRUAL,F9_12_PC_ACCTG_METHOD_CASH,F9_12_PC_ACCTG_METHOD_OTHER,F9_12_PC_AUDIT_COMMITTEE,F9_12_PC_FED_GRNT_AUDIT_PERFORMD,F9_12_PC_FED_GRNT_AUDIT_REQUIRED,F9_12_PC_FINCL_STMTS_AUDITED,F9_12_SCHED_O_X,number_of_ot
her_prog_svces,501c3,F9_00_HD_FILER_ADDR_US_L1,F9_00_HD_FILER_ADDR_US_L2,F9_00_HD_FILER_CITY_US,F9_00_HD_FILER_ZIP_US,F9_00_HD_FILER_COUNTRY_FRGN,F9_00_HD_FILER_STATE_US,F9_00_HD_TIME_STAMP_yr,ein_int
0,https://s3.amazonaws.com/irs-form-990/201812509349300101_public.xml,0.0,2022-09-23 18:48:47+00:00,2018,346526754,{'BusinessNameLine1Txt': 'Lucas County Farm Bureau'},LUCA,4198338015,"{'AddressLine1Txt': '109 Portage St', 'CityNm': 'Woodville', 'StateAbbreviationCd': 'OH', 'ZIPCd': '43469'}",,,,0.0,0.0,,0.0,5.0,0.0,0.0,,272756,0,0.0,0.0,KAYLA RICHARDS,2018-08-29,,OH,2017-08-01,2018-07-31,2017,2018-09-07 04:44:38-07:00,0.0,1.0,0.0,,0.0,,1916.0,239263.0,236036,278582.0,0.0,10,15945.0,383254.0,94080.0,29098.0,0,0.0,0.0,3331,-30456.0,0.0,424855,354081.0,0,7,39.0,38270,323625.0,0,0.0,10,181041,0,10719,386585,IMPROVE RURAL STANDARD OF LIVING.,65279,26001,0,23105,20738.0,424800.0,269425,41546.0,272756,0.0,0.0,,BENEFITS PAID TO OR FOR MEMBERS - THIS IS PAID MEMBERSHIPS TO OHIO FARM BUREAU AND TO AMERICAN FARM BUREAU TO FUTHER THEIR EFFORTS IN PROGRAMMING AND PROMOTING THE FARMING COMMUNITY.,0.0,0.0,0.0,,MEMBERSHIP - COSTS OF PROMOTING FARM BUREAU AND ITS MISSION. PROMOTION OF FARM BUREAU PROGRAMS AND EVENTS IN ORDER TO EDUCATE THE FARMER AND CONSUMER IN CURRENT FARMING AND FOOD ISSUES.,0.0,0.0,0.0,,"CONFERENCE, CONVENTIONS AND MEETINGS - EDUCATION OF VOLUNTEERS FOR THE PROMOTING AND MARKETING OF FARM ISSUES AND CURRENT EVENTS.",0.0,0.0,0.0,0.0,0.0,0.0,0.0,Improve rural standard of 
living.,0.0,0,0,0,0.0,0,0,0.0,0,0.0,0,0,0.0,0.0,0.0,7,0,0,1.0,1,1.0,1.0,0,1,0,0,0,1,0,1,0.0,1.0,0,0.0,0,0,1,1.0,1,1.0,10,10,0,1.0,0.0,0.0,0.0,OH,1,0.0,0,0,0.0,0.0,0.0,0,0.0,2130.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,236036.0,0.0,0.0,0.0,236036.0,26001.0,0.0,272756,0.0,181041.0,7018.0,0.0,6770.0,0.0,6770.0,0.0,0.0,0.0,0.0,697.0,0.0,0.0,3478.0,2283.0,42350,0.0,0.0,0.0,331.0,0.0,0.0,0.0,0.0,0.0,0.0,1860.0,1860.0,2352.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,21245.0,21245.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,269425,0.0,17563.0,251862.0,6278.0,733.0,0.0,1197.0,109784.0,0.0,0.0,0.0,0.0,11076.0,0.0,0.0,27194.0,0.0,0.0,0.0,0.0,74474.0,72904.0,0.0,0.0,0.0,233959.0,165116.0,55332.0,0.0,0.0,1.0,0.0,386585.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,424855,0.0,0.0,0.0,0.0,3331,0.0,0.0,0,1.0,0.0,,0.0,0.0,0.0,0,0.0,,0,109 Portage St,0.0,Woodville,43469,,OH,2018,346526754


In [6]:
%%time
print ("Current date and time : ", datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"), '\n')
df = pd.read_feather('D:/all_filings_april_2025_all_controls_combined_parsed_type_fillnull.feather')
print('# of columns:', len(df.columns))
print('# of observations:', len(df))
df[:1]

Current date and time :  2025-04-19 23:35:19 

# of columns: 307
# of observations: 3469008
CPU times: total: 1min 10s
Wall time: 48 s


Unnamed: 0,_id,OrganizationName,URL,DLN,TaxPeriod,F9_09_PC_FEES_FOR_SVCE_FR_TOT,F9_00_HD_BUILD_TIME_STAMP,fiscal_year,EIN,Name,NameControl,Phone,USAddress,ForeignAddress,InCareOfName,BusinessName,BusinessNameControlTxt,PhoneNum,InCareOfNm,ForeignPhoneNum,F9_00_HD_ADDR_CHANGE,F9_00_HD_AMENDED_RETURN,F9_00_HD_CTRY_OF_DOMICILE,F9_00_HD_EXEMPT_STATUS_4847A1,F9_00_HD_EXEMPT_STATUS_501C,F9_00_HD_EXEMPT_STATUS_501C3,F9_00_HD_FINAL_RETURN,F9_00_HD_GROSS_EXEMPT_NUM,F9_00_HD_GROSS_RCPT,F9_00_HD_GROUP_RETURN,F9_00_HD_INCLUDES_SUBORD_ORGS,F9_00_HD_INITIAL_RETURN,F9_00_HD_PRIN_OFF_NAME,F9_00_HD_SIGNING_OFFICER_SIGNTR,F9_00_HD_SPECIAL_CONDITION_DESC,F9_00_HD_STATE_OF_DOMICILE,F9_00_HD_TAX_PER_BEGIN,F9_00_HD_TAX_PER_END,F9_00_HD_TAX_YEAR,F9_00_HD_TIME_STAMP,F9_00_HD_TYPE_ORG_ASSOCIATION,F9_00_HD_TYPE_ORG_CORP,F9_00_HD_TYPE_ORG_OTHER,F9_00_HD_TYPE_ORG_OTHER_DESC,F9_00_HD_TYPE_ORG_TRUST,F9_00_HD_WEBSITE,F9_00_HD_YEAR_FORMED,F9_01_PC_BEN_PAID_MEMB_PRIOR,F9_01_PC_CONTR_GRANTS_CURR,F9_01_PC_CONTR_GRANTS_PRIOR,F9_01_PC_GRANTS_PRIOR,F9_01_PC_INDEP_VOTING_MEMB,F9_01_PC_INVEST_INCOME_PRIOR,F9_01_PC_NET_ASSETS_BOY,F9_01_PC_OTHER_EXPENSE_PRIOR,F9_01_PC_OTHER_REV_PRIOR,F9_01_PC_PROF_FUNDRISING_EXP_CURR,F9_01_PC_PROF_FUNDRISING_EXP_PRIOR,F9_01_PC_PROG_SERVICE_REV_PRIOR,F9_01_PC_REV_LESS_EXP_CURR,F9_01_PC_REV_LESS_EXP_PRIOR,F9_01_PC_TERMINATION_CONTRACTION,F9_01_PC_TOT_ASSETS_EOY,F9_01_PC_TOT_EXP_PRIOR,F9_01_PC_TOT_FNDR_EXP_CURR,F9_01_PC_TOT_INDIV_EMPLOYED,F9_01_PC_TOT_INDIV_VOLUNTEERS,F9_01_PC_TOT_LIABILITIES_EOY,F9_01_PC_TOT_REVENUE_PRIOR,F9_01_PC_TOT_UBI_GROSS,F9_01_PC_TOT_UBI_NET,F9_01_PC_VOTING_MEMB_GOV_BODY,F9_01_PZ_BEN_PAID_TO_MEMB_CURR,F9_01_PZ_GRANTS_PAID_CURR,F9_01_PZ_INVEST_INCOME_CURR,F9_01_PZ_NAFB_EOY,F9_01_PZ_ORGANIZATIONAL_MISSION,F9_01_PZ_OTHER_EXPENSE_CURR,F9_01_PZ_OTHER_REV_CURR,F9_01_PZ_PROG_SERVICE_REV_CURR,F9_01_PZ_SALARIES_CURR,F9_01_PZ_SALARIES_PRIOR,F9_01_PZ_TOT_ASSETS_BOY,F9_01_PZ_TOT_EXP_CURR,F9_01_PZ_TOT_LIAB_BOY,F9_01_PZ_TOT_REV_CURR,F9_03_PC_PGMSVC_SIGNIF_CHG,F9_03_
PC_PGMSVC_SIGNIF_NEW,F9_03_PC_PROG_SVC_ACC_1_CODE,F9_03_PC_PROG_SVC_ACC_1_DESC,F9_03_PC_PROG_SVC_ACC_1_EXP,F9_03_PC_PROG_SVC_ACC_1_GRNT,F9_03_PC_PROG_SVC_ACC_1_REV,F9_03_PC_PROG_SVC_ACC_2_CODE,F9_03_PC_PROG_SVC_ACC_2_DESC,F9_03_PC_PROG_SVC_ACC_2_EXP,F9_03_PC_PROG_SVC_ACC_2_GRNT,F9_03_PC_PROG_SVC_ACC_2_REV,F9_03_PC_PROG_SVC_ACC_3_CODE,F9_03_PC_PROG_SVC_ACC_3_DESC,F9_03_PC_PROG_SVC_ACC_3_EXP,F9_03_PC_PROG_SVC_ACC_3_GRNT,F9_03_PC_PROG_SVC_ACC_3_REV,F9_03_PC_TOT_OTH_PROG_SVC_EXP,F9_03_PC_TOT_OTH_PROG_SVC_GRNT,F9_03_PC_TOT_OTH_PROG_SVC_REV,F9_03_PC_TOT_PROG_SVC_EXPENSE,F9_03_PZ_MISSION_DESCRIPTION,F9_03_PZ_SCHEDULE_O_PART3,F9_04_PC_ACTVITIES_VIA_PARTNER,F9_04_PC_CONTROLLED_ENTITY,F9_04_PC_DISREGARDED_ENTITY,F9_04_PC_EXCESS_BENEFIT_TRANS,F9_04_PC_FR_EVENT_INC_GT_15K,F9_04_PC_GAMING_INC_GT_15K,F9_04_PC_LOBBYING_ACTIVITIES,F9_04_PC_POLITICAL_ACTIVITIES,F9_04_PC_PRIOR_EXCESS_BEN_TRAN,F9_04_PC_PROF_FR_EXP_GT_15K,F9_04_PC_RELATED_ENTITY,F9_04_PC_TRANS_TO_CNTRLD_ENT,F9_04_PC_TRANS_WITH_CNTRLD_ENT,F9_05_EXP_SCHED_O_X,F9_05_PC_NUMBER_EMPLOYEES_W3,F9_05_PC_NUMBER_FORMS_1096,F9_05_PC_UNRELATED_BUS_INCOME,F9_06_EXP_SCHED_O_X,F9_06_PC_990_PROVIDED_GOV_BODY,F9_06_PC_ANNUAL_DISC_COVRD_PERS,F9_06_PC_CEO_COMPENSTN_PROCESS,F9_06_PC_CHANGES_ORGANIZING_DOCS,F9_06_PC_CONFLICT_OF_INTEREST,F9_06_PC_DECISIONS_SUBJ_APPROVAL,F9_06_PC_DELEGATION_MGT_DUTIES,F9_06_PC_DELEGATION_OF_MGT,F9_06_PC_DOCUMENT_RET_POLICY,F9_06_PC_ELECTION_BOARD_MEMBERS,F9_06_PC_FAMILY_OR_BUSINESS_REL,F9_06_PC_FORM_AVAIL_OWN_WEBSITE,F9_06_PC_FORM_UPON_REQUEST,F9_06_PC_JOINT_VENTURE_INVESTMNT,F9_06_PC_JOINT_VENTURE_POLICY,F9_06_PC_LOCAL_CHAPTERS,F9_06_PC_MATERIAL_DIVERSION,F9_06_PC_MEMBERS_OR_STOCKHOLDERS,F9_06_PC_MINUTES_COMMITTEES,F9_06_PC_MINUTES_GOVERNING_BODY,F9_06_PC_MONITORING_OF_COI_POLICY,F9_06_PC_NUM_IND_VOTING_MEMBERS,F9_06_PC_NUM_VOTING_GOV_MEMBERS,F9_06_PC_OFFICER_MAILING_ADDRESS,F9_06_PC_OTHER_COMPENSTN_PROCESS,F9_06_PC_OTHER_WEBSITE,F9_06_PC_OWN_WEBSITE,F9_06_PC_POLICIES_GOVERN_CHAPTER,F9_06_PC_STATES_WHERE_RET
_FILED,F9_06_PC_WHISTLEBLOWER_POLICY,F9_07_EXP_SCHED_O_X,F9_07_PC_COMPENSATION_OTHER_SRCE,F9_07_PC_FORMER_OFFICER_LISTED,F9_07_PC_NO_LISTED_PERS_COMPENSD,F9_07_PC_NUM_CONTRCTRS_GRTR_100K,F9_07_PC_NUM_INDS_GREATER_100K,F9_07_PC_TOTAL_COMP_GRTR_150K,F9_07_PC_TOT_OTHER_COMPENSATION,F9_07_PC_TOT_REPRT_COMP_FROM_ORG,F9_07_PC_TOT_REPRT_COMP_RLTD_ORG,F9_08_EXP_SCHED_O_X,F9_08_PC_ALL_OTHER_CONTRIBUTIONS,F9_08_PC_CONTS_REPRTD_FNDRAISNG,F9_08_PC_COST_OF_GOODS_SOLD,F9_08_PC_FEDERATED_CAMPAIGNS,F9_08_PC_FUNDRAISING_DIRECT_EXP,F9_08_PC_FUNDRAISING_EVENTS,F9_08_PC_FUNDRAISING_GROSS_INC,F9_08_PC_GAMING_DIRECT_EXPENSES,F9_08_PC_GAMING_GROSS_INCOME,F9_08_PC_GOVERNMENT_GRANTS,F9_08_PC_GROSS_SALES_INVENTORY,F9_08_PC_MEMBERSHIP_DUES,F9_08_PC_NONCASH_CONTRIBUTIONS,F9_08_PC_PROGRAM_SVCE_REV_TOTAL,F9_08_PC_RELATED_ORGANIZATIONS,F9_08_PC_TOTAL_CONTRIBUTIONS,F9_08_PC_TOTAL_OTHER_REVENUE,F9_08_PC_TOTAL_PROG_SVCE_REVENUE,F9_08_PC_TOTAL_REVENUE,F9_09_EXP_AD_PROMO_TOT,F9_09_EXP_BENF_PAID_MEMB_TOT,F9_09_EXP_CONF_MEETING_TOT,F9_09_EXP_DEPREC_FUNDR,F9_09_EXP_DEPREC_MAG,F9_09_EXP_DEPREC_PROG,F9_09_EXP_DEPREC_TOT,F9_09_EXP_GRANT_FRGN_TOT,F9_09_EXP_GRANT_INDIV_DMSTC_TOT,F9_09_EXP_GRANT_ORG_DMSTC_TOT,F9_09_EXP_INFO_TECH_TOT,F9_09_EXP_INSURANCE_TOT,F9_09_EXP_INTEREST_TOT,F9_09_EXP_JOINT_COSTS_TOT,F9_09_EXP_OCCUPANCY_TOT,F9_09_EXP_OFFICE_TOT,F9_09_EXP_OTH_OTH_TOT,F9_09_EXP_ROY_TOT,F9_09_EXP_SCHED_O_X,F9_09_EXP_TRAVEL_ENTRTNMNT_TOT,F9_09_EXP_TRAVEL_TOT,F9_09_PC_COMP_DISQUAL_FUNDRAISE,F9_09_PC_COMP_DISQUAL_MGMT,F9_09_PC_COMP_DISQUAL_PROG_SVCE,F9_09_PC_COMP_DISQUAL_TOTAL,F9_09_PC_COMP_OFFICERS_FUNDRAISE,F9_09_PC_COMP_OFFICERS_MGMT,F9_09_PC_COMP_OFFICERS_PROG_SVCE,F9_09_PC_COMP_OFFICERS_TOTAL,F9_09_PC_FEES_FOR_SVCE_ACCT_TOT,F9_09_PC_FEES_FOR_SVCE_INVST_TOT,F9_09_PC_FEES_FOR_SVCE_LEGL_TOT,F9_09_PC_FEES_FOR_SVCE_LOBB_TOT,F9_09_PC_FEES_FOR_SVCE_MGMT_TOT,F9_09_PC_FEES_FOR_SVCE_OTH_TOT,F9_09_PC_OTHER_EMP_BEN_FUNDRAISE,F9_09_PC_OTHER_EMP_BEN_MGMT,F9_09_PC_OTHER_EMP_BEN_PROG_SVCE,F9_09_PC_OTHER_EMP_BEN_TOTAL,F9_09
_PC_OTHER_SALARY_FUNDRAISE,F9_09_PC_OTHER_SALARY_MGMT,F9_09_PC_OTHER_SALARY_PROG_SVCE,F9_09_PC_OTHER_SALARY_TOTAL,F9_09_PC_PAYMENT_TO_AFFILIATES,F9_09_PC_PAYROLL_TAX_FUNDRAISE,F9_09_PC_PAYROLL_TAX_MGMT,F9_09_PC_PAYROLL_TAX_PROG_SVCE,F9_09_PC_PAYROLL_TAX_TOTAL,F9_09_PC_PENSION_CONT_FUNDRAISE,F9_09_PC_PENSION_CONT_MGMT,F9_09_PC_PENSION_CONT_PROG_SVCE,F9_09_PC_PENSION_CONT_TOTAL,F9_09_PC_TOTAL_FUNC_EXPENSES,F9_09_PC_TOTAL_FUNDRAISE_EXPENSE,F9_09_PC_TOTAL_MGMT_EXPENSE,F9_09_PC_TOTAL_PROG_SVCE_EXPENSE,F9_10_ASSETS_ACC_NET_EOY,F9_10_ASSETS_EXP_PREPAID_EOY,F9_10_ASSETS_INTANGIB_EOY,F9_10_ASSETS_INVENT_SALE_EOY,F9_10_ASSETS_LESS_DEPREC_EOY,F9_10_ASSETS_LOANS_DISQUAL_EOY,F9_10_ASSETS_NOTES_LOANS_NET_EOY,F9_10_ASSETS_OTH_EOY,F9_10_ASSETS_PLEDGES_NET_EOY,F9_10_LIAB_ACC_PAYABLE_EOY,F9_10_LIAB_GRANTS_PAYABLE_EOY,F9_10_LIAB_LOANS_OFF_EOY,F9_10_LIAB_REV_DEFERRED_EOY,F9_10_NAFB_RESTRICT_PERM_EOY,F9_10_NAFB_RESTRICT_TEMP_EOY,F9_10_NAFB_UNRESTRICT_EOY,F9_10_PC_BOND_LIABILITY_EOY,F9_10_PC_CASH_NON_INTEREST_BOY,F9_10_PC_CASH_NON_INTEREST_EOY,F9_10_PC_ESCROW_LIABILITY_EOY,F9_10_PC_INVEST_OTHER_SEC_EOY,F9_10_PC_INVEST_PROG_RELTD_EOY,F9_10_PC_INVEST_PUB_TRADED_EOY,F9_10_PC_LAND_BLDG_EQPMT,F9_10_PC_LAND_BLDG_EQPMT_DEPRCTN,F9_10_PC_LOANS_FROM_OFFICERS_EOY,F9_10_PC_ORG_FOLLOWS_SFAS117,F9_10_PC_ORG_NOT_FOLLOW_SFAS117,F9_10_PC_OTHER_LIABILITIES_EOY,F9_10_PC_RET_EARNINGS_ENDWMT_EOY,F9_10_PC_SAVINGS_TEMP_INVEST_BOY,F9_10_PC_SAVINGS_TEMP_INVEST_EOY,F9_10_PC_SECURED_MORTGAGES_EOY,F9_10_PC_SECURE_MORT_NOTES_EOY,F9_10_PC_UNSECURED_LOANS_EOY,F9_10_PC_UNSECURED_NOTES_BOY,F9_10_PC_UNSECURED_NOTES_EOY,F9_10_PZ_TOTAL_ASSETS_EOY,F9_10_SCHED_O_X,F9_11_PC_RECNCLTN_DONATED_SVCES,F9_11_PC_RECNCLTN_INVSTMNT_EXP,F9_11_PC_RECNCLTN_PRIOR_PER_ADJ,F9_11_PC_RECNCLTN_REV_LESS_EXP,F9_11_PC_RECNCLTN_UNRLZD_GAIN,F9_11_SCHED_O_X,F9_12_PC_ACCNT_COMPILE_OR_REVIEW,F9_12_PC_ACCTG_METHOD_ACCRUAL,F9_12_PC_ACCTG_METHOD_CASH,F9_12_PC_ACCTG_METHOD_OTHER,F9_12_PC_AUDIT_COMMITTEE,F9_12_PC_FED_GRNT_AUDIT_PERFORMD,F9_12_PC_FED_GRNT_A
UDIT_REQUIRED,F9_12_PC_FINCL_STMTS_AUDITED,F9_12_SCHED_O_X,number_of_other_prog_svces,501c3,F9_00_HD_FILER_ADDR_US_L1,F9_00_HD_FILER_ADDR_US_L2,F9_00_HD_FILER_CITY_US,F9_00_HD_FILER_ZIP_US,F9_00_HD_FILER_COUNTRY_FRGN,F9_00_HD_FILER_STATE_US,F9_00_HD_TIME_STAMP_yr,ein_int
0,5d019e6778ffca27b42818d7,RONALD MCDONALD HOUSE CHARITIES- PHILADELPHIA REGION INC,https://s3.amazonaws.com/irs-form-990/201113139349301301_public.xml,93493313013011,201012,0,2016-02-24 21:20:13+00:00,,232705170,"{'BusinessNameLine1': 'RONALD MCDONALD HOUSE CHARITIES-', 'BusinessNameLine2': 'PHILADELPHIA REGION INC'}",RONA,8565826843,"{'AddressLine1': '1525 VALLEY CENTER PARKWAY NO 300', 'AddressLine1Txt': None, 'AddressLine2': None, 'AddressLine2Txt': None, 'City': 'BETHLEHEM', 'CityNm': None, 'State': 'PA', 'StateAbbreviationCd': None, 'ZIPCd': None, 'ZIPCode': '18017'}",,,,,,,,1,0,,0,,1,0,,1473903,0,0,0,MICHAEL ANTON,2011-11-04,,PA,2010-01-01,2010-12-31,2010,2011-11-09 12:41:09+00:00,0,1,0,,0,,1992,0,1439340,1044925,638637,10,30447,1753405,243131,0,0,0,0,89152,193604,0,2440859,881768,195892,0,0,450430,1075372,0,0,10,0,925000,33563,1990429,MAKES GRANTS TO NON-PROFITS THAT DIRECTLY IMPROVE THE HEALTH AND WELL-BEING OF CHILDREN.,459751,1000,0,0,0,1925215,1384751,171810,1473903,0,0,,"RMHC OF THE PHILADELPHIA REGION, INC. GRANTS HUNDREDS OF THOUSANDS OF DOLLARS PER YEAR TO SUPPORT NON-PROFIT PROGRAMS THAT DIRECTLY IMPROVE THE HEALTH AND WELL-BEING OF CHILDREN. LOCALLY, RMHC SUPPORTS THE PHILADELPHIA, SOUTHERN NEW JERSEY AND DE...",1043744,925000,0,,,0,0,0,,,0,0,0,0,0,0,1043744,"THE CORPORATION IS ORGANIZED AND WILL BE OPERATED EXCLUSIVELY FOR CHARITABLE, EDUCATIONAL AND SCIENTIFIC PURPOSES WITHIN THE MEANING OF SECTION 501(C)(3) OF THE INTERNAL REVENUE CODE. 
SUCH PURPOSES SHALL BE LIMITED TO PROVIDING SUPPORT AND FUNDIN...",1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,1,1,1,10,10,0,0,0,0,0,"[""PA"", ""NJ"", ""DE""]",0,0,0,0,1,0,0,0,0,0,0,0,1439340,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1439340,1000,0,1473903,0,0,0,0,86228,0,86228,0,33000,892000,0,0,0,0,0,123,763,0,0,0,0,0,0,0,0,0,0,0,0,21675,0,215,0,0,0,0,0,0,0,0,0,0,0,118744,0,0,0,0,0,0,0,0,1384751,195892,145115,1043744,147981,0,0,0,170617,0,0,0,0,44353,166000,0,0,0,0,1990429,0,0,0,0,1851561,0,0,256845,86228,0,1,0,240077,0,332660,270700,0,0,0,0,0,2440859,0,0,0,0,89152,0,1,0,1,0,,1,0,0,1,1,,1,1525 VALLEY CENTER PARKWAY NO 300,,BETHLEHEM,18017,,PA,2011,232705170


In [9]:
len(set(df['EIN'].tolist()))

456945

# Read in BMF File

In [5]:
#%%time
#bmf = pd.read_pickle('BMF Data for 321,292 EINs.pkl')
#bmf = pd.read_pickle('BMF Data for 333,303 EINs.pkl')
#print('# of columns:', len(bmf.columns))
#print('# of observations:', len(bmf))
#bmf[:1]

# of columns: 41
# of observations: 333303
Wall time: 1.39 s


Unnamed: 0,EIN,SEC_NAME,FRCD,SUBSECCD,TAXPER,ASSETS,INCOME,NAME,ADDRESS,CITY,STATE,NTEEFINAL,NAICS,ZIP5,OUTNCCS,OUTREAS,RULEDATE,FIPS,FNDNCD,PMSA,MSA_NECH,CASSETS,CFINSRC,CTAXPER,CTOTREV,ACCPER,RANDNUM,NTEECC,NTEE1,LEVEL4,LEVEL1,NTMAJ10,MAJGRPB,LEVEL3,LEVEL2,NTMAJ12,NTMAJ5,FILER,ZFILER,EIN9,NTEECONF
24,10018922,178 DANIEL E LAMBERT MEMORIAL,10.0,19,201904.0,231924.0,287169.0,AMERICAN LEGION,5 VERTI DRIVE SUITE B,WINSLOW,ME,W30,813410.0,4901.0,IN,,194610.0,23013,0.0,,,194504.0,16eofinextract990.dat,201605.0,211590.0,4.0,0.038356,W30,W,W,O,PU,W,PB,O,PU,OT,Y,N,10018922,


In [10]:
#%%time
#import datetime
#print ("Current date and time : ", datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"), '\n')
#bmf = pd.read_pickle('BMF Data for 2,594,652 EINs (2010-2022 -- most recent entry per org).pkl.gz', compression='gzip')
#print('# of columns:', len(bmf.columns))
#print('# of observations:', len(bmf))
#bmf[:1]

Current date and time :  2024-03-30 23:48:09 

# of columns: 41
# of observations: 2594652
CPU times: total: 9.48 s
Wall time: 9.98 s


Unnamed: 0,EIN,SEC_NAME,FRCD,SUBSECCD,TAXPER,ASSETS,INCOME,NAME,ADDRESS,CITY,STATE,NTEEFINAL,NAICS,ZIP5,RULEDATE,FIPS,FNDNCD,PMSA,MSA_NECH,CASSETS,CFINSRC,CTAXPER,CTOTREV,ACCPER,RANDNUM,NTEECC,NTEE1,LEVEL4,LEVEL1,NTMAJ10,MAJGRPB,LEVEL3,LEVEL2,NTMAJ12,NTMAJ5,FILER,ZFILER,OUTREAS,OUTNCCS,EIN9,NTEECONF
0,4101,,1.0,3,,,,SOUTH LAFOURCHE QUARTERBACK CLUB,167 BENT CYPRESS LN,LOCKPORT,LA,N65,,70374.0,202005.0,,,,,,,,,,0.498058,N65,N,N,PC,HU,N,HS,O,HU,HU,N,N,,IN,4101,


#### Read in NCCS UNIFIED file

In [10]:
%%time
print ("Current date and time : ", datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"), '\n')
bmf = pd.read_csv('BMF_UNIFIED_V1.1.csv', dtype=str, low_memory=False)#, encoding='utf-8')
print('# of columns:', len(bmf.columns))
print('# of observations:', len(bmf))
bmf[:2]

Current date and time :  2025-04-20 22:29:38 

# of columns: 49
# of observations: 3462997
CPU times: total: 1min 2s
Wall time: 1min 4s


Unnamed: 0,EIN2,EIN,NTEE_IRS,NTEE_NCCS,NTEEV2,NCCS_LEVEL_1,NCCS_LEVEL_2,NCCS_LEVEL_3,F990_TOTAL_REVENUE_RECENT,F990_TOTAL_INCOME_RECENT,F990_TOTAL_ASSETS_RECENT,F990_ORG_ADDR_CITY,F990_ORG_ADDR_STATE,F990_ORG_ADDR_ZIP,F990_ORG_ADDR_STREET,CENSUS_CBSA_FIPS,CENSUS_CBSA_NAME,CENSUS_BLOCK_FIPS,CENSUS_URBAN_AREA,CENSUS_STATE_ABBR,CENSUS_COUNTY_NAME,ORG_ADDR_FULL,ORG_ADDR_MATCH,LATITUDE,LONGITUDE,GEOCODER_SCORE,GEOCODER_MATCH,BMF_SUBSECTION_CODE,BMF_STATUS_CODE,BMF_PF_FILING_REQ_CODE,BMF_ORGANIZATION_CODE,BMF_INCOME_CODE,BMF_GROUP_EXEMPT_NUM,BMF_FOUNDATION_CODE,BMF_FILING_REQ_CODE,BMF_DEDUCTIBILITY_CODE,BMF_CLASSIFICATION_CODE,BMF_ASSET_CODE,BMF_AFFILIATION_CODE,ORG_RULING_DATE,ORG_FISCAL_YEAR,ORG_RULING_YEAR,ORG_YEAR_FIRST,ORG_YEAR_LAST,ORG_YEAR_COUNT,ORG_PERS_ICO,ORG_NAME_SEC,ORG_NAME_CURRENT,ORG_FISCAL_PERIOD
0,EIN-00-0000000,0,Z99,,,501CX NONPROFIT,O,UN,,0,0,,,,,,,,,,,",,",,0.0,0.0,0.0,U,,,,,,,,,,,,,,,,1996,1996,1,,,,6
1,EIN-00-0000001,1,B43,B43,UNI-B43-RG,501C3 CHARITY,O,ED,,0,0,LOUISVILLE,KY,63130.0,570 SOUTH FOURTH STREET NO 100,31140.0,"Louisville/Jefferson County, KY-IN",211110049002000.0,U,KY,Jefferson County,"570 SOUTH FOURTH STREET NO 100,LOUISVILLE,KY,63130","570 S 4th St, #100, Louisville, Kentucky, 40202",38.2491390380287,-85.7578707324714,99.76,M,3.0,,,,,,0.0,10.0,,,,,,,,2017,2017,1,,VOAKY HOPEFUL ROAD INC,VOLUNTEERS OF AMERICA INC,6


In [8]:
print(bmf.columns)

Index(['EIN2', 'EIN', 'NTEE_IRS', 'NTEE_NCCS', 'NTEEV2', 'NCCS_LEVEL_1',
       'NCCS_LEVEL_2', 'NCCS_LEVEL_3', 'F990_TOTAL_REVENUE_RECENT',
       'F990_TOTAL_INCOME_RECENT', 'F990_TOTAL_ASSETS_RECENT',
       'F990_ORG_ADDR_CITY', 'F990_ORG_ADDR_STATE', 'F990_ORG_ADDR_ZIP',
       'F990_ORG_ADDR_STREET', 'CENSUS_CBSA_FIPS', 'CENSUS_CBSA_NAME',
       'CENSUS_BLOCK_FIPS', 'CENSUS_URBAN_AREA', 'CENSUS_STATE_ABBR',
       'CENSUS_COUNTY_NAME', 'ORG_ADDR_FULL', 'ORG_ADDR_MATCH', 'LATITUDE',
       'LONGITUDE', 'GEOCODER_SCORE', 'GEOCODER_MATCH', 'BMF_SUBSECTION_CODE',
       'BMF_STATUS_CODE', 'BMF_PF_FILING_REQ_CODE', 'BMF_ORGANIZATION_CODE',
       'BMF_INCOME_CODE', 'BMF_GROUP_EXEMPT_NUM', 'BMF_FOUNDATION_CODE',
       'BMF_FILING_REQ_CODE', 'BMF_DEDUCTIBILITY_CODE',
       'BMF_CLASSIFICATION_CODE', 'BMF_ASSET_CODE', 'BMF_AFFILIATION_CODE',
       'ORG_RULING_DATE', 'ORG_FISCAL_YEAR', 'ORG_RULING_YEAR',
       'ORG_YEAR_FIRST', 'ORG_YEAR_LAST', 'ORG_YEAR_COUNT', 'ORG_PERS_ICO',
     

In [14]:
print(bmf['EIN'].str.len().value_counts())
#print(bmf['EIN2'].str.len().value_counts())
print(bmf['CENSUS_CBSA_FIPS'].str.len().value_counts())
print(bmf['CENSUS_BLOCK_FIPS'].str.len().value_counts())
print(bmf['GEOCODER_SCORE'].str.len().value_counts())
print(bmf['GEOCODER_MATCH'].str.len().value_counts())

EIN
9    3335896
8     125678
7       1334
6         59
1         19
5          7
4          4
Name: count, dtype: int64
CENSUS_CBSA_FIPS
5.0    3216294
Name: count, dtype: int64
CENSUS_BLOCK_FIPS
15.0    2840782
14.0     615375
Name: count, dtype: int64
GEOCODER_SCORE
3.0    2595980
2.0     584257
5.0     225669
4.0      50834
1.0       6256
Name: count, dtype: int64
GEOCODER_MATCH
1.0    3462996
Name: count, dtype: int64


In [15]:
census_cols = ['CENSUS_CBSA_FIPS', 'CENSUS_BLOCK_FIPS', 'GEOCODER_SCORE', 'GEOCODER_MATCH']
bmf[census_cols].sample(10)

Unnamed: 0,CENSUS_CBSA_FIPS,CENSUS_BLOCK_FIPS,GEOCODER_SCORE,GEOCODER_MATCH
882988,31080,60371288012000,98.9,M
2090149,33100,120860001421004,100.0,M
2994699,19660,121270826042046,100.0,M
2987607,17140,390610242001001,100.0,M
1204088,26420,482014540002000,97.72,M
552385,47900,110019800001085,86.0,M
2311964,30020,400310024051122,98.0,M
393527,37980,421010073003004,100.0,M
1147948,16980,170318019011013,100.0,M
3172543,49420,530770005004006,100.0,M
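
The 14-character `CENSUS_BLOCK_FIPS` values look like block codes that lost a leading zero (state FIPS codes below 10), since a full census block GEOID has 15 digits (2 state + 3 county + 6 tract + 4 block). A hedged sketch of restoring the fixed width, on toy values shaped like the ones above; whether to pad is a judgment call that the notebook itself does not make here.

```python
import pandas as pd

# Toy values: a 14-digit code, a 15-digit code, a code with a trailing ".0"
# left over from a float round-trip, and a missing value.
block = pd.Series(["60371288012000", "120860001421004", "211110049002000.0", None])

# Drop a trailing ".0" if present, then left-pad to the 15-digit block GEOID width.
block_clean = block.str.replace(r"\.0$", "", regex=True).str.zfill(15)

print(block_clean.tolist())
```

Missing values pass through the string methods unchanged, so they remain missing after padding.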


In [11]:
[c for c in bmf.columns if 'ein' in c.lower()]

['EIN2', 'EIN']

In [12]:
bmf[['EIN2', 'EIN']].sample(5)

Unnamed: 0,EIN2,EIN
1736171,EIN-47-4542227,474542227.0
558208,EIN-23-7216435,237216435.0
2257872,EIN-68-0524786,680524786.0
2921676,EIN-85-2958316,852958316.0
2975943,EIN-86-2050388,862050388.0


In [None]:
# Keep a copy of the raw EIN before it is zero-padded
bmf['EIN_original'] = bmf['EIN']
bmf[['EIN2', 'EIN', 'EIN_original']].sample(5)

In [13]:
%%time
# Missing EINs become 0; every EIN is then zero-padded to a 9-character string
bmf['EIN'] = bmf['EIN'].fillna(0).astype(int).astype(str).str.zfill(9)
bmf[['EIN2', 'EIN']].sample(5)

CPU times: total: 1.95 s
Wall time: 2.05 s


Unnamed: 0,EIN2,EIN
829989,EIN-27-0645197,270645197
3341984,EIN-94-2609016,942609016
277436,EIN-20-1718908,201718908
1285417,EIN-38-3305177,383305177
3226608,EIN-92-0298583,920298583


In [15]:
bmf[['EIN2', 'EIN']].dtypes

EIN2    object
EIN     object
dtype: object
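
A minimal sketch of the EIN normalization above, run on toy data (the frame `toy` and its values are made up for illustration). It suggests two things worth keeping in mind: EINs stored as floats lose their leading zeros until `zfill(9)` restores them, and `fillna(0)` turns a missing EIN into the placeholder `'000000000'`, which may be worth counting before the merge.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the BMF frame; the values are illustrative only.
toy = pd.DataFrame({'EIN': [474542227.0, 43134710.0, np.nan]})

# Same chain as in the notebook: float -> int -> zero-padded 9-character string.
toy['EIN'] = toy['EIN'].fillna(0).astype(int).astype(str).str.zfill(9)

# Missing EINs end up as the all-zero placeholder, so count them explicitly.
n_placeholder = (toy['EIN'] == '000000000').sum()
print(toy['EIN'].tolist())
print(n_placeholder)
```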

#### Rename BMF columns

In [17]:
bmf.columns

Index(['EIN2', 'EIN', 'NTEE_IRS', 'NTEE_NCCS', 'NTEEV2', 'NCCS_LEVEL_1',
       'NCCS_LEVEL_2', 'NCCS_LEVEL_3', 'F990_TOTAL_REVENUE_RECENT',
       'F990_TOTAL_INCOME_RECENT', 'F990_TOTAL_ASSETS_RECENT',
       'F990_ORG_ADDR_CITY', 'F990_ORG_ADDR_STATE', 'F990_ORG_ADDR_ZIP',
       'F990_ORG_ADDR_STREET', 'CENSUS_CBSA_FIPS', 'CENSUS_CBSA_NAME',
       'CENSUS_BLOCK_FIPS', 'CENSUS_URBAN_AREA', 'CENSUS_STATE_ABBR',
       'CENSUS_COUNTY_NAME', 'ORG_ADDR_FULL', 'ORG_ADDR_MATCH', 'LATITUDE',
       'LONGITUDE', 'GEOCODER_SCORE', 'GEOCODER_MATCH', 'BMF_SUBSECTION_CODE',
       'BMF_STATUS_CODE', 'BMF_PF_FILING_REQ_CODE', 'BMF_ORGANIZATION_CODE',
       'BMF_INCOME_CODE', 'BMF_GROUP_EXEMPT_NUM', 'BMF_FOUNDATION_CODE',
       'BMF_FILING_REQ_CODE', 'BMF_DEDUCTIBILITY_CODE',
       'BMF_CLASSIFICATION_CODE', 'BMF_ASSET_CODE', 'BMF_AFFILIATION_CODE',
       'ORG_RULING_DATE', 'ORG_FISCAL_YEAR', 'ORG_RULING_YEAR',
       'ORG_YEAR_FIRST', 'ORG_YEAR_LAST', 'ORG_YEAR_COUNT', 'ORG_PERS_ICO',
     

In [18]:
bmf_cols = ['BMF_'+c for c in bmf.columns.tolist()]
print(len(bmf.columns))
print(len(bmf_cols))
print(bmf_cols)

49
49
['BMF_EIN2', 'BMF_EIN', 'BMF_NTEE_IRS', 'BMF_NTEE_NCCS', 'BMF_NTEEV2', 'BMF_NCCS_LEVEL_1', 'BMF_NCCS_LEVEL_2', 'BMF_NCCS_LEVEL_3', 'BMF_F990_TOTAL_REVENUE_RECENT', 'BMF_F990_TOTAL_INCOME_RECENT', 'BMF_F990_TOTAL_ASSETS_RECENT', 'BMF_F990_ORG_ADDR_CITY', 'BMF_F990_ORG_ADDR_STATE', 'BMF_F990_ORG_ADDR_ZIP', 'BMF_F990_ORG_ADDR_STREET', 'BMF_CENSUS_CBSA_FIPS', 'BMF_CENSUS_CBSA_NAME', 'BMF_CENSUS_BLOCK_FIPS', 'BMF_CENSUS_URBAN_AREA', 'BMF_CENSUS_STATE_ABBR', 'BMF_CENSUS_COUNTY_NAME', 'BMF_ORG_ADDR_FULL', 'BMF_ORG_ADDR_MATCH', 'BMF_LATITUDE', 'BMF_LONGITUDE', 'BMF_GEOCODER_SCORE', 'BMF_GEOCODER_MATCH', 'BMF_BMF_SUBSECTION_CODE', 'BMF_BMF_STATUS_CODE', 'BMF_BMF_PF_FILING_REQ_CODE', 'BMF_BMF_ORGANIZATION_CODE', 'BMF_BMF_INCOME_CODE', 'BMF_BMF_GROUP_EXEMPT_NUM', 'BMF_BMF_FOUNDATION_CODE', 'BMF_BMF_FILING_REQ_CODE', 'BMF_BMF_DEDUCTIBILITY_CODE', 'BMF_BMF_CLASSIFICATION_CODE', 'BMF_BMF_ASSET_CODE', 'BMF_BMF_AFFILIATION_CODE', 'BMF_ORG_RULING_DATE', 'BMF_ORG_FISCAL_YEAR', 'BMF_ORG_RULING_YEAR

In [19]:
bmf[:1]

Unnamed: 0,EIN2,EIN,NTEE_IRS,NTEE_NCCS,NTEEV2,NCCS_LEVEL_1,NCCS_LEVEL_2,NCCS_LEVEL_3,F990_TOTAL_REVENUE_RECENT,F990_TOTAL_INCOME_RECENT,F990_TOTAL_ASSETS_RECENT,F990_ORG_ADDR_CITY,F990_ORG_ADDR_STATE,F990_ORG_ADDR_ZIP,F990_ORG_ADDR_STREET,CENSUS_CBSA_FIPS,CENSUS_CBSA_NAME,CENSUS_BLOCK_FIPS,CENSUS_URBAN_AREA,CENSUS_STATE_ABBR,CENSUS_COUNTY_NAME,ORG_ADDR_FULL,ORG_ADDR_MATCH,LATITUDE,LONGITUDE,GEOCODER_SCORE,GEOCODER_MATCH,BMF_SUBSECTION_CODE,BMF_STATUS_CODE,BMF_PF_FILING_REQ_CODE,BMF_ORGANIZATION_CODE,BMF_INCOME_CODE,BMF_GROUP_EXEMPT_NUM,BMF_FOUNDATION_CODE,BMF_FILING_REQ_CODE,BMF_DEDUCTIBILITY_CODE,BMF_CLASSIFICATION_CODE,BMF_ASSET_CODE,BMF_AFFILIATION_CODE,ORG_RULING_DATE,ORG_FISCAL_YEAR,ORG_RULING_YEAR,ORG_YEAR_FIRST,ORG_YEAR_LAST,ORG_YEAR_COUNT,ORG_PERS_ICO,ORG_NAME_SEC,ORG_NAME_CURRENT,ORG_FISCAL_PERIOD
0,EIN-00-0000000,0,Z99,,,501CX NONPROFIT,O,UN,,0.0,0.0,,,,,,,,,,,",,",,0.0,0.0,0.0,U,,,,,,,,,,,,,,,,1996,1996,1,,,,6


In [20]:
bmf.columns = bmf_cols
bmf[:1]

Unnamed: 0,BMF_EIN2,BMF_EIN,BMF_NTEE_IRS,BMF_NTEE_NCCS,BMF_NTEEV2,BMF_NCCS_LEVEL_1,BMF_NCCS_LEVEL_2,BMF_NCCS_LEVEL_3,BMF_F990_TOTAL_REVENUE_RECENT,BMF_F990_TOTAL_INCOME_RECENT,BMF_F990_TOTAL_ASSETS_RECENT,BMF_F990_ORG_ADDR_CITY,BMF_F990_ORG_ADDR_STATE,BMF_F990_ORG_ADDR_ZIP,BMF_F990_ORG_ADDR_STREET,BMF_CENSUS_CBSA_FIPS,BMF_CENSUS_CBSA_NAME,BMF_CENSUS_BLOCK_FIPS,BMF_CENSUS_URBAN_AREA,BMF_CENSUS_STATE_ABBR,BMF_CENSUS_COUNTY_NAME,BMF_ORG_ADDR_FULL,BMF_ORG_ADDR_MATCH,BMF_LATITUDE,BMF_LONGITUDE,BMF_GEOCODER_SCORE,BMF_GEOCODER_MATCH,BMF_BMF_SUBSECTION_CODE,BMF_BMF_STATUS_CODE,BMF_BMF_PF_FILING_REQ_CODE,BMF_BMF_ORGANIZATION_CODE,BMF_BMF_INCOME_CODE,BMF_BMF_GROUP_EXEMPT_NUM,BMF_BMF_FOUNDATION_CODE,BMF_BMF_FILING_REQ_CODE,BMF_BMF_DEDUCTIBILITY_CODE,BMF_BMF_CLASSIFICATION_CODE,BMF_BMF_ASSET_CODE,BMF_BMF_AFFILIATION_CODE,BMF_ORG_RULING_DATE,BMF_ORG_FISCAL_YEAR,BMF_ORG_RULING_YEAR,BMF_ORG_YEAR_FIRST,BMF_ORG_YEAR_LAST,BMF_ORG_YEAR_COUNT,BMF_ORG_PERS_ICO,BMF_ORG_NAME_SEC,BMF_ORG_NAME_CURRENT,BMF_ORG_FISCAL_PERIOD
0,EIN-00-0000000,0,Z99,,,501CX NONPROFIT,O,UN,,0.0,0.0,,,,,,,,,,,",,",,0.0,0.0,0.0,U,,,,,,,,,,,,,,,,1996,1996,1,,,,6
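
Because some unified-BMF columns already start with `BMF_`, the blanket prefix above produces names such as `BMF_BMF_SUBSECTION_CODE`. The sketch below shows one possible alternative that only adds the prefix where it is missing. It is not what this notebook does (the later cells rely on the doubled names), and the toy frame is an assumption for illustration.

```python
import pandas as pd

# Toy frame with three column names taken from the listing above.
toy = pd.DataFrame(columns=['EIN', 'NTEE_IRS', 'BMF_SUBSECTION_CODE'])

# Hypothetical alternative: prefix only the columns that lack 'BMF_'.
new_cols = [c if c.startswith('BMF_') else 'BMF_' + c for c in toy.columns]
toy.columns = new_cols
print(toy.columns.tolist())
```

If this variant were adopted, every later reference to a `BMF_BMF_...` column would need to be renamed accordingly.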


# Merge 990 and BMF Data

In [21]:
df[[c for c in df.columns.tolist() if 'EIN' in c]][:1]

Unnamed: 0,EIN
0,232705170


In [22]:
bmf[[c for c in bmf.columns.tolist() if 'EIN' in c]][:1]

Unnamed: 0,BMF_EIN2,BMF_EIN
0,EIN-00-0000000,0


In [24]:
bmf['BMF_EIN'].apply(lambda x: len(str(x))).value_counts()

BMF_EIN
9    3462997
Name: count, dtype: int64

In [25]:
df['EIN'].apply(lambda x: len(str(x))).value_counts()

EIN
9    3469008
Name: count, dtype: int64

<br>Test the merge

In [28]:
%%time
pd.merge(df, bmf, left_on='EIN', right_on='BMF_EIN', how='left', indicator=True)['_merge'].value_counts()

CPU times: total: 49.1 s
Wall time: 50.7 s


_merge
both          3444255
left_only       27516
right_only          0
Name: count, dtype: int64

In [30]:
3444255/(3444255+27516)

0.9920743620474968
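
The match rate computed by hand above can also be read directly off the `_merge` indicator. A small sketch on toy frames (`left`, `right` and their values are assumptions, not the real data):

```python
import pandas as pd

# Toy stand-ins for the 990 filings (left) and the BMF (right).
left = pd.DataFrame({'EIN': ['000000001', '000000002', '000000003']})
right = pd.DataFrame({'BMF_EIN': ['000000001', '000000002'],
                      'BMF_NTEE_IRS': ['E86', 'T50']})

test = pd.merge(left, right, left_on='EIN', right_on='BMF_EIN',
                how='left', indicator=True)

# Share of rows in the merged frame that found a BMF match.
match_rate = (test['_merge'] == 'both').mean()
print(round(match_rate, 4))
```

Note that this is a share of merged rows, so it matches the share of filings only when the right-hand key is unique.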

#### Old way

In [21]:
#%%time
#pd.merge(df, bmf, left_on='EIN', right_on='BMF_EIN9', how='left', indicator=True)['_merge'].value_counts()

CPU times: total: 14.8 s
Wall time: 15.3 s


both          877302
left_only      14678
right_only         0
Name: _merge, dtype: int64

In [None]:
877302/(877302+14678)

<br>Merge the datasets

In [31]:
%%time
print(len(df))
print(len(pd.merge(df, bmf, left_on='EIN', right_on='BMF_EIN', how='left', indicator=True)))
merged = pd.merge(df, bmf, left_on='EIN', right_on='BMF_EIN', how='left', indicator=True)
print(len(merged))
print(merged['_merge'].value_counts())
#merged = merged.drop('_merge', 1)
merged[:1]

3469008
3471771
3471771
_merge
both          3444255
left_only       27516
right_only          0
Name: count, dtype: int64
CPU times: total: 1min 33s
Wall time: 1min 36s


Unnamed: 0,_id,OrganizationName,URL,DLN,TaxPeriod,F9_09_PC_FEES_FOR_SVCE_FR_TOT,F9_00_HD_BUILD_TIME_STAMP,fiscal_year,EIN,Name,NameControl,Phone,USAddress,ForeignAddress,InCareOfName,BusinessName,BusinessNameControlTxt,PhoneNum,InCareOfNm,ForeignPhoneNum,F9_00_HD_ADDR_CHANGE,F9_00_HD_AMENDED_RETURN,F9_00_HD_CTRY_OF_DOMICILE,F9_00_HD_EXEMPT_STATUS_4847A1,F9_00_HD_EXEMPT_STATUS_501C,F9_00_HD_EXEMPT_STATUS_501C3,F9_00_HD_FINAL_RETURN,F9_00_HD_GROSS_EXEMPT_NUM,F9_00_HD_GROSS_RCPT,F9_00_HD_GROUP_RETURN,F9_00_HD_INCLUDES_SUBORD_ORGS,F9_00_HD_INITIAL_RETURN,F9_00_HD_PRIN_OFF_NAME,F9_00_HD_SIGNING_OFFICER_SIGNTR,F9_00_HD_SPECIAL_CONDITION_DESC,F9_00_HD_STATE_OF_DOMICILE,F9_00_HD_TAX_PER_BEGIN,F9_00_HD_TAX_PER_END,F9_00_HD_TAX_YEAR,F9_00_HD_TIME_STAMP,F9_00_HD_TYPE_ORG_ASSOCIATION,F9_00_HD_TYPE_ORG_CORP,F9_00_HD_TYPE_ORG_OTHER,F9_00_HD_TYPE_ORG_OTHER_DESC,F9_00_HD_TYPE_ORG_TRUST,F9_00_HD_WEBSITE,F9_00_HD_YEAR_FORMED,F9_01_PC_BEN_PAID_MEMB_PRIOR,F9_01_PC_CONTR_GRANTS_CURR,F9_01_PC_CONTR_GRANTS_PRIOR,F9_01_PC_GRANTS_PRIOR,F9_01_PC_INDEP_VOTING_MEMB,F9_01_PC_INVEST_INCOME_PRIOR,F9_01_PC_NET_ASSETS_BOY,F9_01_PC_OTHER_EXPENSE_PRIOR,F9_01_PC_OTHER_REV_PRIOR,F9_01_PC_PROF_FUNDRISING_EXP_CURR,F9_01_PC_PROF_FUNDRISING_EXP_PRIOR,F9_01_PC_PROG_SERVICE_REV_PRIOR,F9_01_PC_REV_LESS_EXP_CURR,F9_01_PC_REV_LESS_EXP_PRIOR,F9_01_PC_TERMINATION_CONTRACTION,F9_01_PC_TOT_ASSETS_EOY,F9_01_PC_TOT_EXP_PRIOR,F9_01_PC_TOT_FNDR_EXP_CURR,F9_01_PC_TOT_INDIV_EMPLOYED,F9_01_PC_TOT_INDIV_VOLUNTEERS,F9_01_PC_TOT_LIABILITIES_EOY,F9_01_PC_TOT_REVENUE_PRIOR,F9_01_PC_TOT_UBI_GROSS,F9_01_PC_TOT_UBI_NET,F9_01_PC_VOTING_MEMB_GOV_BODY,F9_01_PZ_BEN_PAID_TO_MEMB_CURR,F9_01_PZ_GRANTS_PAID_CURR,F9_01_PZ_INVEST_INCOME_CURR,F9_01_PZ_NAFB_EOY,F9_01_PZ_ORGANIZATIONAL_MISSION,F9_01_PZ_OTHER_EXPENSE_CURR,F9_01_PZ_OTHER_REV_CURR,F9_01_PZ_PROG_SERVICE_REV_CURR,F9_01_PZ_SALARIES_CURR,F9_01_PZ_SALARIES_PRIOR,F9_01_PZ_TOT_ASSETS_BOY,F9_01_PZ_TOT_EXP_CURR,F9_01_PZ_TOT_LIAB_BOY,F9_01_PZ_TOT_REV_CURR,F9_03_PC_PGMSVC_SIGNIF_CHG,F9_03_
PC_PGMSVC_SIGNIF_NEW,F9_03_PC_PROG_SVC_ACC_1_CODE,F9_03_PC_PROG_SVC_ACC_1_DESC,F9_03_PC_PROG_SVC_ACC_1_EXP,F9_03_PC_PROG_SVC_ACC_1_GRNT,F9_03_PC_PROG_SVC_ACC_1_REV,F9_03_PC_PROG_SVC_ACC_2_CODE,F9_03_PC_PROG_SVC_ACC_2_DESC,F9_03_PC_PROG_SVC_ACC_2_EXP,F9_03_PC_PROG_SVC_ACC_2_GRNT,F9_03_PC_PROG_SVC_ACC_2_REV,F9_03_PC_PROG_SVC_ACC_3_CODE,F9_03_PC_PROG_SVC_ACC_3_DESC,F9_03_PC_PROG_SVC_ACC_3_EXP,F9_03_PC_PROG_SVC_ACC_3_GRNT,F9_03_PC_PROG_SVC_ACC_3_REV,F9_03_PC_TOT_OTH_PROG_SVC_EXP,F9_03_PC_TOT_OTH_PROG_SVC_GRNT,F9_03_PC_TOT_OTH_PROG_SVC_REV,F9_03_PC_TOT_PROG_SVC_EXPENSE,F9_03_PZ_MISSION_DESCRIPTION,F9_03_PZ_SCHEDULE_O_PART3,F9_04_PC_ACTVITIES_VIA_PARTNER,F9_04_PC_CONTROLLED_ENTITY,F9_04_PC_DISREGARDED_ENTITY,F9_04_PC_EXCESS_BENEFIT_TRANS,F9_04_PC_FR_EVENT_INC_GT_15K,F9_04_PC_GAMING_INC_GT_15K,F9_04_PC_LOBBYING_ACTIVITIES,F9_04_PC_POLITICAL_ACTIVITIES,F9_04_PC_PRIOR_EXCESS_BEN_TRAN,F9_04_PC_PROF_FR_EXP_GT_15K,F9_04_PC_RELATED_ENTITY,F9_04_PC_TRANS_TO_CNTRLD_ENT,F9_04_PC_TRANS_WITH_CNTRLD_ENT,F9_05_EXP_SCHED_O_X,F9_05_PC_NUMBER_EMPLOYEES_W3,F9_05_PC_NUMBER_FORMS_1096,F9_05_PC_UNRELATED_BUS_INCOME,F9_06_EXP_SCHED_O_X,F9_06_PC_990_PROVIDED_GOV_BODY,F9_06_PC_ANNUAL_DISC_COVRD_PERS,F9_06_PC_CEO_COMPENSTN_PROCESS,F9_06_PC_CHANGES_ORGANIZING_DOCS,F9_06_PC_CONFLICT_OF_INTEREST,F9_06_PC_DECISIONS_SUBJ_APPROVAL,F9_06_PC_DELEGATION_MGT_DUTIES,F9_06_PC_DELEGATION_OF_MGT,F9_06_PC_DOCUMENT_RET_POLICY,F9_06_PC_ELECTION_BOARD_MEMBERS,F9_06_PC_FAMILY_OR_BUSINESS_REL,F9_06_PC_FORM_AVAIL_OWN_WEBSITE,F9_06_PC_FORM_UPON_REQUEST,F9_06_PC_JOINT_VENTURE_INVESTMNT,F9_06_PC_JOINT_VENTURE_POLICY,F9_06_PC_LOCAL_CHAPTERS,F9_06_PC_MATERIAL_DIVERSION,F9_06_PC_MEMBERS_OR_STOCKHOLDERS,F9_06_PC_MINUTES_COMMITTEES,F9_06_PC_MINUTES_GOVERNING_BODY,F9_06_PC_MONITORING_OF_COI_POLICY,F9_06_PC_NUM_IND_VOTING_MEMBERS,F9_06_PC_NUM_VOTING_GOV_MEMBERS,F9_06_PC_OFFICER_MAILING_ADDRESS,F9_06_PC_OTHER_COMPENSTN_PROCESS,F9_06_PC_OTHER_WEBSITE,F9_06_PC_OWN_WEBSITE,F9_06_PC_POLICIES_GOVERN_CHAPTER,F9_06_PC_STATES_WHERE_RET
_FILED,F9_06_PC_WHISTLEBLOWER_POLICY,F9_07_EXP_SCHED_O_X,F9_07_PC_COMPENSATION_OTHER_SRCE,F9_07_PC_FORMER_OFFICER_LISTED,F9_07_PC_NO_LISTED_PERS_COMPENSD,F9_07_PC_NUM_CONTRCTRS_GRTR_100K,F9_07_PC_NUM_INDS_GREATER_100K,F9_07_PC_TOTAL_COMP_GRTR_150K,F9_07_PC_TOT_OTHER_COMPENSATION,F9_07_PC_TOT_REPRT_COMP_FROM_ORG,F9_07_PC_TOT_REPRT_COMP_RLTD_ORG,F9_08_EXP_SCHED_O_X,F9_08_PC_ALL_OTHER_CONTRIBUTIONS,F9_08_PC_CONTS_REPRTD_FNDRAISNG,F9_08_PC_COST_OF_GOODS_SOLD,F9_08_PC_FEDERATED_CAMPAIGNS,F9_08_PC_FUNDRAISING_DIRECT_EXP,F9_08_PC_FUNDRAISING_EVENTS,F9_08_PC_FUNDRAISING_GROSS_INC,F9_08_PC_GAMING_DIRECT_EXPENSES,F9_08_PC_GAMING_GROSS_INCOME,F9_08_PC_GOVERNMENT_GRANTS,F9_08_PC_GROSS_SALES_INVENTORY,F9_08_PC_MEMBERSHIP_DUES,F9_08_PC_NONCASH_CONTRIBUTIONS,F9_08_PC_PROGRAM_SVCE_REV_TOTAL,F9_08_PC_RELATED_ORGANIZATIONS,F9_08_PC_TOTAL_CONTRIBUTIONS,F9_08_PC_TOTAL_OTHER_REVENUE,F9_08_PC_TOTAL_PROG_SVCE_REVENUE,F9_08_PC_TOTAL_REVENUE,F9_09_EXP_AD_PROMO_TOT,F9_09_EXP_BENF_PAID_MEMB_TOT,F9_09_EXP_CONF_MEETING_TOT,F9_09_EXP_DEPREC_FUNDR,F9_09_EXP_DEPREC_MAG,F9_09_EXP_DEPREC_PROG,F9_09_EXP_DEPREC_TOT,F9_09_EXP_GRANT_FRGN_TOT,F9_09_EXP_GRANT_INDIV_DMSTC_TOT,F9_09_EXP_GRANT_ORG_DMSTC_TOT,F9_09_EXP_INFO_TECH_TOT,F9_09_EXP_INSURANCE_TOT,F9_09_EXP_INTEREST_TOT,F9_09_EXP_JOINT_COSTS_TOT,F9_09_EXP_OCCUPANCY_TOT,F9_09_EXP_OFFICE_TOT,F9_09_EXP_OTH_OTH_TOT,F9_09_EXP_ROY_TOT,F9_09_EXP_SCHED_O_X,F9_09_EXP_TRAVEL_ENTRTNMNT_TOT,F9_09_EXP_TRAVEL_TOT,F9_09_PC_COMP_DISQUAL_FUNDRAISE,F9_09_PC_COMP_DISQUAL_MGMT,F9_09_PC_COMP_DISQUAL_PROG_SVCE,F9_09_PC_COMP_DISQUAL_TOTAL,F9_09_PC_COMP_OFFICERS_FUNDRAISE,F9_09_PC_COMP_OFFICERS_MGMT,F9_09_PC_COMP_OFFICERS_PROG_SVCE,F9_09_PC_COMP_OFFICERS_TOTAL,F9_09_PC_FEES_FOR_SVCE_ACCT_TOT,F9_09_PC_FEES_FOR_SVCE_INVST_TOT,F9_09_PC_FEES_FOR_SVCE_LEGL_TOT,F9_09_PC_FEES_FOR_SVCE_LOBB_TOT,F9_09_PC_FEES_FOR_SVCE_MGMT_TOT,F9_09_PC_FEES_FOR_SVCE_OTH_TOT,F9_09_PC_OTHER_EMP_BEN_FUNDRAISE,F9_09_PC_OTHER_EMP_BEN_MGMT,F9_09_PC_OTHER_EMP_BEN_PROG_SVCE,F9_09_PC_OTHER_EMP_BEN_TOTAL,F9_09
_PC_OTHER_SALARY_FUNDRAISE,F9_09_PC_OTHER_SALARY_MGMT,F9_09_PC_OTHER_SALARY_PROG_SVCE,F9_09_PC_OTHER_SALARY_TOTAL,F9_09_PC_PAYMENT_TO_AFFILIATES,F9_09_PC_PAYROLL_TAX_FUNDRAISE,F9_09_PC_PAYROLL_TAX_MGMT,F9_09_PC_PAYROLL_TAX_PROG_SVCE,F9_09_PC_PAYROLL_TAX_TOTAL,F9_09_PC_PENSION_CONT_FUNDRAISE,F9_09_PC_PENSION_CONT_MGMT,F9_09_PC_PENSION_CONT_PROG_SVCE,F9_09_PC_PENSION_CONT_TOTAL,F9_09_PC_TOTAL_FUNC_EXPENSES,F9_09_PC_TOTAL_FUNDRAISE_EXPENSE,F9_09_PC_TOTAL_MGMT_EXPENSE,F9_09_PC_TOTAL_PROG_SVCE_EXPENSE,F9_10_ASSETS_ACC_NET_EOY,F9_10_ASSETS_EXP_PREPAID_EOY,F9_10_ASSETS_INTANGIB_EOY,F9_10_ASSETS_INVENT_SALE_EOY,F9_10_ASSETS_LESS_DEPREC_EOY,F9_10_ASSETS_LOANS_DISQUAL_EOY,F9_10_ASSETS_NOTES_LOANS_NET_EOY,F9_10_ASSETS_OTH_EOY,F9_10_ASSETS_PLEDGES_NET_EOY,F9_10_LIAB_ACC_PAYABLE_EOY,F9_10_LIAB_GRANTS_PAYABLE_EOY,F9_10_LIAB_LOANS_OFF_EOY,F9_10_LIAB_REV_DEFERRED_EOY,F9_10_NAFB_RESTRICT_PERM_EOY,F9_10_NAFB_RESTRICT_TEMP_EOY,F9_10_NAFB_UNRESTRICT_EOY,F9_10_PC_BOND_LIABILITY_EOY,F9_10_PC_CASH_NON_INTEREST_BOY,F9_10_PC_CASH_NON_INTEREST_EOY,F9_10_PC_ESCROW_LIABILITY_EOY,F9_10_PC_INVEST_OTHER_SEC_EOY,F9_10_PC_INVEST_PROG_RELTD_EOY,F9_10_PC_INVEST_PUB_TRADED_EOY,F9_10_PC_LAND_BLDG_EQPMT,F9_10_PC_LAND_BLDG_EQPMT_DEPRCTN,F9_10_PC_LOANS_FROM_OFFICERS_EOY,F9_10_PC_ORG_FOLLOWS_SFAS117,F9_10_PC_ORG_NOT_FOLLOW_SFAS117,F9_10_PC_OTHER_LIABILITIES_EOY,F9_10_PC_RET_EARNINGS_ENDWMT_EOY,F9_10_PC_SAVINGS_TEMP_INVEST_BOY,F9_10_PC_SAVINGS_TEMP_INVEST_EOY,F9_10_PC_SECURED_MORTGAGES_EOY,F9_10_PC_SECURE_MORT_NOTES_EOY,F9_10_PC_UNSECURED_LOANS_EOY,F9_10_PC_UNSECURED_NOTES_BOY,F9_10_PC_UNSECURED_NOTES_EOY,F9_10_PZ_TOTAL_ASSETS_EOY,F9_10_SCHED_O_X,F9_11_PC_RECNCLTN_DONATED_SVCES,F9_11_PC_RECNCLTN_INVSTMNT_EXP,F9_11_PC_RECNCLTN_PRIOR_PER_ADJ,F9_11_PC_RECNCLTN_REV_LESS_EXP,F9_11_PC_RECNCLTN_UNRLZD_GAIN,F9_11_SCHED_O_X,F9_12_PC_ACCNT_COMPILE_OR_REVIEW,F9_12_PC_ACCTG_METHOD_ACCRUAL,F9_12_PC_ACCTG_METHOD_CASH,F9_12_PC_ACCTG_METHOD_OTHER,F9_12_PC_AUDIT_COMMITTEE,F9_12_PC_FED_GRNT_AUDIT_PERFORMD,F9_12_PC_FED_GRNT_A
UDIT_REQUIRED,F9_12_PC_FINCL_STMTS_AUDITED,F9_12_SCHED_O_X,number_of_other_prog_svces,501c3,F9_00_HD_FILER_ADDR_US_L1,F9_00_HD_FILER_ADDR_US_L2,F9_00_HD_FILER_CITY_US,F9_00_HD_FILER_ZIP_US,F9_00_HD_FILER_COUNTRY_FRGN,F9_00_HD_FILER_STATE_US,F9_00_HD_TIME_STAMP_yr,ein_int,BMF_EIN2,BMF_EIN,BMF_NTEE_IRS,BMF_NTEE_NCCS,BMF_NTEEV2,BMF_NCCS_LEVEL_1,BMF_NCCS_LEVEL_2,BMF_NCCS_LEVEL_3,BMF_F990_TOTAL_REVENUE_RECENT,BMF_F990_TOTAL_INCOME_RECENT,BMF_F990_TOTAL_ASSETS_RECENT,BMF_F990_ORG_ADDR_CITY,BMF_F990_ORG_ADDR_STATE,BMF_F990_ORG_ADDR_ZIP,BMF_F990_ORG_ADDR_STREET,BMF_CENSUS_CBSA_FIPS,BMF_CENSUS_CBSA_NAME,BMF_CENSUS_BLOCK_FIPS,BMF_CENSUS_URBAN_AREA,BMF_CENSUS_STATE_ABBR,BMF_CENSUS_COUNTY_NAME,BMF_ORG_ADDR_FULL,BMF_ORG_ADDR_MATCH,BMF_LATITUDE,BMF_LONGITUDE,BMF_GEOCODER_SCORE,BMF_GEOCODER_MATCH,BMF_BMF_SUBSECTION_CODE,BMF_BMF_STATUS_CODE,BMF_BMF_PF_FILING_REQ_CODE,BMF_BMF_ORGANIZATION_CODE,BMF_BMF_INCOME_CODE,BMF_BMF_GROUP_EXEMPT_NUM,BMF_BMF_FOUNDATION_CODE,BMF_BMF_FILING_REQ_CODE,BMF_BMF_DEDUCTIBILITY_CODE,BMF_BMF_CLASSIFICATION_CODE,BMF_BMF_ASSET_CODE,BMF_BMF_AFFILIATION_CODE,BMF_ORG_RULING_DATE,BMF_ORG_FISCAL_YEAR,BMF_ORG_RULING_YEAR,BMF_ORG_YEAR_FIRST,BMF_ORG_YEAR_LAST,BMF_ORG_YEAR_COUNT,BMF_ORG_PERS_ICO,BMF_ORG_NAME_SEC,BMF_ORG_NAME_CURRENT,BMF_ORG_FISCAL_PERIOD,_merge
0,5d019e6778ffca27b42818d7,RONALD MCDONALD HOUSE CHARITIES- PHILADELPHIA REGION INC,https://s3.amazonaws.com/irs-form-990/201113139349301301_public.xml,93493313013011,201012,0,2016-02-24 21:20:13+00:00,,232705170,"{'BusinessNameLine1': 'RONALD MCDONALD HOUSE CHARITIES-', 'BusinessNameLine2': 'PHILADELPHIA REGION INC'}",RONA,8565826843,"{'AddressLine1': '1525 VALLEY CENTER PARKWAY NO 300', 'AddressLine1Txt': None, 'AddressLine2': None, 'AddressLine2Txt': None, 'City': 'BETHLEHEM', 'CityNm': None, 'State': 'PA', 'StateAbbreviationCd': None, 'ZIPCd': None, 'ZIPCode': '18017'}",,,,,,,,1,0,,0,,1,0,,1473903,0,0,0,MICHAEL ANTON,2011-11-04,,PA,2010-01-01,2010-12-31,2010,2011-11-09 12:41:09+00:00,0,1,0,,0,,1992,0,1439340,1044925,638637,10,30447,1753405,243131,0,0,0,0,89152,193604,0,2440859,881768,195892,0,0,450430,1075372,0,0,10,0,925000,33563,1990429,MAKES GRANTS TO NON-PROFITS THAT DIRECTLY IMPROVE THE HEALTH AND WELL-BEING OF CHILDREN.,459751,1000,0,0,0,1925215,1384751,171810,1473903,0,0,,"RMHC OF THE PHILADELPHIA REGION, INC. GRANTS HUNDREDS OF THOUSANDS OF DOLLARS PER YEAR TO SUPPORT NON-PROFIT PROGRAMS THAT DIRECTLY IMPROVE THE HEALTH AND WELL-BEING OF CHILDREN. LOCALLY, RMHC SUPPORTS THE PHILADELPHIA, SOUTHERN NEW JERSEY AND DE...",1043744,925000,0,,,0,0,0,,,0,0,0,0,0,0,1043744,"THE CORPORATION IS ORGANIZED AND WILL BE OPERATED EXCLUSIVELY FOR CHARITABLE, EDUCATIONAL AND SCIENTIFIC PURPOSES WITHIN THE MEANING OF SECTION 501(C)(3) OF THE INTERNAL REVENUE CODE. 
SUCH PURPOSES SHALL BE LIMITED TO PROVIDING SUPPORT AND FUNDIN...",1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,1,1,1,10,10,0,0,0,0,0,"[""PA"", ""NJ"", ""DE""]",0,0,0,0,1,0,0,0,0,0,0,0,1439340,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1439340,1000,0,1473903,0,0,0,0,86228,0,86228,0,33000,892000,0,0,0,0,0,123,763,0,0,0,0,0,0,0,0,0,0,0,0,21675,0,215,0,0,0,0,0,0,0,0,0,0,0,118744,0,0,0,0,0,0,0,0,1384751,195892,145115,1043744,147981,0,0,0,170617,0,0,0,0,44353,166000,0,0,0,0,1990429,0,0,0,0,1851561,0,0,256845,86228,0,1,0,240077,0,332660,270700,0,0,0,0,0,2440859,0,0,0,0,89152,0,1,0,1,0,,1,0,0,1,1,,1,1525 VALLEY CENTER PARKWAY NO 300,,BETHLEHEM,18017,,PA,2011,232705170,EIN-23-2705170,232705170,E86,E86,HEL-E86-RG,501C3 CHARITY,O,HE,,3141381.0,1408486.0,PHILADELPHIA,PA,19104,3925 CHESTNUT ST,37980.0,"Philadelphia-Camden-Wilmington, PA-NJ-DE-MD",421010100000000.0,U,PA,Philadelphia County,"3925 CHESTNUT ST,PHILADELPHIA,PA,19104","3925 Chestnut St, Philadelphia, Pennsylvania, 19104",39.955542,-75.201247,100.0,M,3.0,,,,,0.0,15.0,10.0,,,,,2007-12,2019.0,2007.0,1995.0,2019.0,25.0,,PHILADELPHIA REGION INC,RONALD MCDONALD HOUSE CHARITIES INC,12.0,both


In [32]:
merged = merged.drop('_merge', axis=1)

In [33]:
print(len(df.columns))
print(len(bmf.columns))
print(len(merged.columns))

307
49
356


In [34]:
print('# of unique orgs:', len(set(df['EIN'].tolist())))
print('# of unique filings:', len(set(df['URL'].tolist())))
print('# of obs in DF:', len(df))

# of unique orgs: 456945
# of unique filings: 3469008
# of obs in DF: 3469008


In [35]:
print('# of unique orgs in BMF:', len(set(bmf['BMF_EIN'].tolist())))
print('# of obs in BMF:', len(bmf))

# of unique orgs in BMF: 3436970
# of obs in BMF: 3462997


In [37]:
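# Unique EINs that found a BMF match
print(len(set(merged[merged['BMF_EIN'].notnull()]['BMF_EIN'].tolist())))
# Sanity check: unmatched rows hold only NaN in BMF_EIN, so this prints 1
print(len(set(merged[merged['BMF_EIN'].isnull()]['BMF_EIN'].tolist())))
# Unique e-file EINs with no BMF match
print(len(set(merged[merged['BMF_EIN'].isnull()]['EIN'].tolist())))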

448845
1
8100
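
The counts above show 3,462,997 BMF rows but only 3,436,970 unique `BMF_EIN` values, and the merged frame has 3,471,771 rows against 3,469,008 filings. That pattern is what a left merge produces when the right-hand key contains duplicates. The sketch below, on toy data, shows the inflation, how `validate='m:1'` surfaces it, and one possible way to deduplicate first. Keeping the row with the latest `BMF_ORG_YEAR_LAST` is an assumption for illustration, not a rule taken from this notebook.

```python
import pandas as pd

# Toy stand-ins; EIN '000000001' appears twice on the BMF side.
filings = pd.DataFrame({'EIN': ['000000001', '000000001', '000000002']})
bmf_toy = pd.DataFrame({
    'BMF_EIN': ['000000001', '000000001', '000000002'],
    'BMF_ORG_YEAR_LAST': [2019, 2024, 2024],
})

# Duplicate keys on the right multiply the matching left rows.
inflated = pd.merge(filings, bmf_toy, left_on='EIN', right_on='BMF_EIN', how='left')
print(len(filings), len(inflated))

# validate='m:1' makes pandas raise instead of silently adding rows.
try:
    pd.merge(filings, bmf_toy, left_on='EIN', right_on='BMF_EIN',
             how='left', validate='m:1')
    raised = False
except pd.errors.MergeError:
    raised = True

# Assumed fix: keep one BMF row per EIN (here, the latest BMF_ORG_YEAR_LAST).
bmf_unique = (bmf_toy.sort_values('BMF_ORG_YEAR_LAST')
                     .drop_duplicates('BMF_EIN', keep='last'))
clean = pd.merge(filings, bmf_unique, left_on='EIN', right_on='BMF_EIN',
                 how='left', validate='m:1')
print(len(clean))
```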


In [38]:
merged[['EIN']+bmf_cols].sample(10)

Unnamed: 0,EIN,BMF_EIN2,BMF_EIN,BMF_NTEE_IRS,BMF_NTEE_NCCS,BMF_NTEEV2,BMF_NCCS_LEVEL_1,BMF_NCCS_LEVEL_2,BMF_NCCS_LEVEL_3,BMF_F990_TOTAL_REVENUE_RECENT,BMF_F990_TOTAL_INCOME_RECENT,BMF_F990_TOTAL_ASSETS_RECENT,BMF_F990_ORG_ADDR_CITY,BMF_F990_ORG_ADDR_STATE,BMF_F990_ORG_ADDR_ZIP,BMF_F990_ORG_ADDR_STREET,BMF_CENSUS_CBSA_FIPS,BMF_CENSUS_CBSA_NAME,BMF_CENSUS_BLOCK_FIPS,BMF_CENSUS_URBAN_AREA,BMF_CENSUS_STATE_ABBR,BMF_CENSUS_COUNTY_NAME,BMF_ORG_ADDR_FULL,BMF_ORG_ADDR_MATCH,BMF_LATITUDE,BMF_LONGITUDE,BMF_GEOCODER_SCORE,BMF_GEOCODER_MATCH,BMF_BMF_SUBSECTION_CODE,BMF_BMF_STATUS_CODE,BMF_BMF_PF_FILING_REQ_CODE,BMF_BMF_ORGANIZATION_CODE,BMF_BMF_INCOME_CODE,BMF_BMF_GROUP_EXEMPT_NUM,BMF_BMF_FOUNDATION_CODE,BMF_BMF_FILING_REQ_CODE,BMF_BMF_DEDUCTIBILITY_CODE,BMF_BMF_CLASSIFICATION_CODE,BMF_BMF_ASSET_CODE,BMF_BMF_AFFILIATION_CODE,BMF_ORG_RULING_DATE,BMF_ORG_FISCAL_YEAR,BMF_ORG_RULING_YEAR,BMF_ORG_YEAR_FIRST,BMF_ORG_YEAR_LAST,BMF_ORG_YEAR_COUNT,BMF_ORG_PERS_ICO,BMF_ORG_NAME_SEC,BMF_ORG_NAME_CURRENT,BMF_ORG_FISCAL_PERIOD
1099289,43134710,EIN-04-3134710,43134710,E99Z,,,501C3 CHARITY,O,UN,1041678.0,1041678.0,1221352.0,MATTAPOISETT,MA,02739-0000,CO DR ZARINS 5 SHIPYARD LANE,14460.0,"Boston-Cambridge-Newton, MA-NH",250235600000000.0,U,MA,Plymouth County,"CO DR ZARINS 5 SHIPYARD LANE,MATTAPOISETT,MA,02739-0000","5 Ship Yard Lane, Mattapoisett, Massachusetts, 02739",41.657517,-70.804723,96.83,M,3.0,1.0,0.0,1.0,6.0,0.0,15.0,1.0,1.0,1000.0,6.0,3.0,1992-02,2024.0,1992.0,1995.0,2024.0,30.0,% LATVIAN MEDICAL FOUNDATION,,LATVIAN MEDICAL FOUNDATION,2.0
1940089,450432901,EIN-45-0432901,450432901,G960,,,501C3 CHARITY,O,UN,332323.0,332323.0,397918.0,ELMHURST,IL,60126-2895,110 E SCHILLER ST STE 230,16980.0,"Chicago-Naperville-Elgin, IL-IN-WI",170438400000000.0,U,IL,DuPage County,"110 E SCHILLER ST STE 230,ELMHURST,IL,60126-2895","110 E Schiller St, Ste 230, Elmhurst, Illinois, 60126",41.901059,-87.939588,100.0,M,3.0,1.0,0.0,1.0,4.0,0.0,16.0,1.0,1.0,2000.0,4.0,3.0,1994-03,2024.0,1994.0,1995.0,2024.0,30.0,,,AMERICAN SOCIETY OF NEUROPHYSIOLOGICAL MONITORING,3.0
1542208,231370483,EIN-23-1370483,231370483,X20,X20,REL-X20-RG,501C3 CHARITY,S,RE,382766.0,382766.0,27943.0,READING,PA,19610-2110,1534 CLEVELAND AVE,39740.0,"Reading, PA",420110100000000.0,U,PA,Berks County,"1534 CLEVELAND AVE,READING,PA,19610-2110","1534 Cleveland Ave, Reading, Pennsylvania, 19610",40.328006,-75.970863,100.0,M,3.0,1.0,0.0,5.0,4.0,0.0,17.0,1.0,1.0,7000.0,3.0,3.0,1948-10,2024.0,1948.0,1989.0,2024.0,36.0,,,READING BERKS CONFERENCE OF CHURCHES,10.0
2575148,540795710,EIN-54-0795710,540795710,K28,K28,HMS-K28-RG,501CX NONPROFIT,O,HS,77475.0,92409.0,511397.0,CHARLOTTE C H,VA,23923-0338,PO BOX 338,,,510379300000000.0,R,VA,Charlotte County,"PO BOX 338,CHARLOTTE C H,VA,23923-0338","23923-0338, Charlotte Court House, Virginia",37.09962,-78.65118,97.9,M,5.0,1.0,0.0,5.0,3.0,0.0,0.0,1.0,2.0,1000.0,5.0,3.0,1967-03,2024.0,1967.0,1995.0,2024.0,30.0,,,CHARLOTTE COUNTY FARM BUREAU,3.0
1661884,943205455,EIN-94-3205455,943205455,T99,T99,PSB-T99-RG,501C3 CHARITY,S,ZF,6554595.0,6554595.0,2560274.0,SAN FRANCISCO,CA,94107-3016,1060 TENNESSEE STREET FLOOR 2,41860.0,"San Francisco-Oakland-Berkeley, CA",60750230000000.0,U,CA,San Francisco County,"1060 TENNESSEE STREET FLOOR 2,SAN FRANCISCO,CA,94107-3016","1060 Tennessee St, San Francisco, California, 94107",37.758582,-122.38935,100.0,M,3.0,1.0,0.0,1.0,7.0,0.0,15.0,1.0,1.0,1000.0,6.0,3.0,1999-03,2024.0,1999.0,1995.0,2024.0,30.0,% TONY PARKER,,PARTNERS IN SCHOOL INNOVATION,3.0
2142474,590751935,EIN-59-0751935,590751935,P27,P27,HMS-P27-RG,501C3 CHARITY,O,HS,3381616.0,3399861.0,3453462.0,WEST PALM BEACH,FL,33401-3304,1016 N DIXIE HIGHWAY,33100.0,"Miami-Fort Lauderdale-Pompano Beach, FL",120990000000000.0,U,FL,Palm Beach County,"1016 N DIXIE HIGHWAY,WEST PALM BEACH,FL,33401-3304","1016 N Dixie Hwy, West Palm Beach, Florida, 33401",26.722293,-80.053254,100.0,M,3.0,1.0,0.0,5.0,6.0,0.0,16.0,1.0,1.0,1200.0,6.0,3.0,1943-01,2024.0,1943.0,1989.0,2024.0,36.0,,,YWCA OF PALM BEACH COUNTY INC,1.0
3450717,61615351,EIN-06-1615351,61615351,C34,C34,ENV-C34-RG,501C3 CHARITY,O,EN,57287.0,144378.0,2525742.0,BOLTON,CT,06043-7659,59 MAPLE VALLEY RD,25540.0,"Hartford-East Hartford-Middletown, CT",90135290000000.0,R,CT,Capitol Planning Region,"59 MAPLE VALLEY RD,BOLTON,CT,06043-7659","59 Maple Valley Rd, Bolton, Connecticut, 06043",41.777202,-72.427643,100.0,M,3.0,1.0,0.0,1.0,4.0,0.0,15.0,1.0,1.0,1000.0,6.0,3.0,2001-12,2024.0,2001.0,2002.0,2024.0,23.0,% GWEN E MARRION,,BOLTON LAND TRUST INC,12.0
2097773,421276632,EIN-42-1276632,421276632,E24,E24,HOS-E24-RG,501C3 CHARITY,O,HE,10407519.0,10463519.0,22981124.0,CEDAR RAPIDS,IA,52402-3160,4080 1ST AVE NE,16300.0,"Cedar Rapids, IA",191130000000000.0,U,IA,Linn County,"4080 1ST AVE NE,CEDAR RAPIDS,IA,52402-3160","4080 1st Ave NE, Cedar Rapids, Iowa, 52402",42.020658,-91.632524,100.0,M,3.0,1.0,0.0,1.0,8.0,0.0,16.0,1.0,1.0,1000.0,8.0,3.0,1986-10,2024.0,1986.0,1989.0,2024.0,36.0,% HEALTHCARE OF IOWA,,STL CARE COMPANY,10.0
23171,233048845,EIN-23-3048845,233048845,T50,T50,PSB-T50-RG,501C3 CHARITY,O,PB,601104.0,601104.0,9941149.0,PHILADELPHIA,PA,19141-2710,1315 WINDRIM AVE,37980.0,"Philadelphia-Camden-Wilmington, PA-NJ-DE-MD",421010300000000.0,U,PA,Philadelphia County,"1315 WINDRIM AVE,PHILADELPHIA,PA,19141-2710","1315 Windrim Ave, Philadelphia, Pennsylvania, 19141",40.0306,-75.144779,100.0,M,3.0,1.0,0.0,1.0,5.0,0.0,15.0,1.0,1.0,1200.0,7.0,3.0,2001-05,2024.0,2001.0,2001.0,2024.0,24.0,% JOHN LOCKARD,,WES CORPORATION,5.0
3376409,471566544,EIN-47-1566544,471566544,I73,I73,HMS-I73-RG,501C3 CHARITY,O,HS,326693.0,326693.0,202120.0,FAIRHOPE,AL,36533-1032,PO BOX 1032,19300.0,"Daphne-Fairhope-Foley, AL",10030110000000.0,U,AL,Baldwin County,"PO BOX 1032,FAIRHOPE,AL,36533-1032","36533-1032, Fairhope, Alabama",30.523122,-87.900358,98.0,M,3.0,1.0,0.0,1.0,4.0,0.0,15.0,1.0,1.0,1000.0,4.0,3.0,2015-05,2024.0,2015.0,2015.0,2024.0,10.0,,,ROADS OF HOPE INC,5.0


In [None]:
#LIKELY NEW 501c3 COLUMNS
['BMF_NCCS_LEVEL_1', 'BMF_BMF_SUBSECTION_CODE']

#### Save DF

In [37]:
#%%time
#import datetime
#print ("Current date and time : ", datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"), '\n')
#merged.to_pickle('990 and BMF control variables.pkl.gz', compression='gzip')

Wall time: 30min 25s


# Compare 501(c)(3) variables
This section contains the verifications summarized in the first 'overview' cell of this notebook. The code here has not been cleaned up, so it can be skipped if desired. The final few cells of the notebook contain the 'solution' to the problem of which orgs to treat as 501(c)(3)s and then save a version of the DF limited to the valid 501(c)(3) orgs.
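
As a rough sketch of the comparison, the e-file flag can be cross-tabulated against a BMF-based status. The toy frame below is made up. It assumes that the e-file variable `501c3` is coded 0/1 and that subsection code 3 marks 501(c)(3) organizations, as the sampled rows above suggest. The helper column `bmf_status` is hypothetical.

```python
import pandas as pd

# Toy rows: e-file flag and BMF subsection code (None = no BMF record).
toy = pd.DataFrame({
    '501c3': [1, 1, 0, 0, 1],
    'BMF_BMF_SUBSECTION_CODE': [3.0, 5.0, 3.0, 4.0, None],
})

# Hypothetical three-way status derived from the BMF subsection code.
toy['bmf_status'] = 'not 501c3'
toy.loc[toy['BMF_BMF_SUBSECTION_CODE'] == 3, 'bmf_status'] = '501c3'
toy.loc[toy['BMF_BMF_SUBSECTION_CODE'].isnull(), 'bmf_status'] = 'no BMF record'

# Rows: e-file flag; columns: BMF-based status. Off-diagonal cells are the disagreements.
table = pd.crosstab(toy['501c3'], toy['bmf_status'])
print(table)
```

Keeping 'no BMF record' as its own column separates filings that disagree with the BMF from filings that simply have no BMF match.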

In [39]:
merged[:1]

Unnamed: 0,_id,OrganizationName,URL,DLN,TaxPeriod,F9_09_PC_FEES_FOR_SVCE_FR_TOT,F9_00_HD_BUILD_TIME_STAMP,fiscal_year,EIN,Name,NameControl,Phone,USAddress,ForeignAddress,InCareOfName,BusinessName,BusinessNameControlTxt,PhoneNum,InCareOfNm,ForeignPhoneNum,F9_00_HD_ADDR_CHANGE,F9_00_HD_AMENDED_RETURN,F9_00_HD_CTRY_OF_DOMICILE,F9_00_HD_EXEMPT_STATUS_4847A1,F9_00_HD_EXEMPT_STATUS_501C,F9_00_HD_EXEMPT_STATUS_501C3,F9_00_HD_FINAL_RETURN,F9_00_HD_GROSS_EXEMPT_NUM,F9_00_HD_GROSS_RCPT,F9_00_HD_GROUP_RETURN,F9_00_HD_INCLUDES_SUBORD_ORGS,F9_00_HD_INITIAL_RETURN,F9_00_HD_PRIN_OFF_NAME,F9_00_HD_SIGNING_OFFICER_SIGNTR,F9_00_HD_SPECIAL_CONDITION_DESC,F9_00_HD_STATE_OF_DOMICILE,F9_00_HD_TAX_PER_BEGIN,F9_00_HD_TAX_PER_END,F9_00_HD_TAX_YEAR,F9_00_HD_TIME_STAMP,F9_00_HD_TYPE_ORG_ASSOCIATION,F9_00_HD_TYPE_ORG_CORP,F9_00_HD_TYPE_ORG_OTHER,F9_00_HD_TYPE_ORG_OTHER_DESC,F9_00_HD_TYPE_ORG_TRUST,F9_00_HD_WEBSITE,F9_00_HD_YEAR_FORMED,F9_01_PC_BEN_PAID_MEMB_PRIOR,F9_01_PC_CONTR_GRANTS_CURR,F9_01_PC_CONTR_GRANTS_PRIOR,F9_01_PC_GRANTS_PRIOR,F9_01_PC_INDEP_VOTING_MEMB,F9_01_PC_INVEST_INCOME_PRIOR,F9_01_PC_NET_ASSETS_BOY,F9_01_PC_OTHER_EXPENSE_PRIOR,F9_01_PC_OTHER_REV_PRIOR,F9_01_PC_PROF_FUNDRISING_EXP_CURR,F9_01_PC_PROF_FUNDRISING_EXP_PRIOR,F9_01_PC_PROG_SERVICE_REV_PRIOR,F9_01_PC_REV_LESS_EXP_CURR,F9_01_PC_REV_LESS_EXP_PRIOR,F9_01_PC_TERMINATION_CONTRACTION,F9_01_PC_TOT_ASSETS_EOY,F9_01_PC_TOT_EXP_PRIOR,F9_01_PC_TOT_FNDR_EXP_CURR,F9_01_PC_TOT_INDIV_EMPLOYED,F9_01_PC_TOT_INDIV_VOLUNTEERS,F9_01_PC_TOT_LIABILITIES_EOY,F9_01_PC_TOT_REVENUE_PRIOR,F9_01_PC_TOT_UBI_GROSS,F9_01_PC_TOT_UBI_NET,F9_01_PC_VOTING_MEMB_GOV_BODY,F9_01_PZ_BEN_PAID_TO_MEMB_CURR,F9_01_PZ_GRANTS_PAID_CURR,F9_01_PZ_INVEST_INCOME_CURR,F9_01_PZ_NAFB_EOY,F9_01_PZ_ORGANIZATIONAL_MISSION,F9_01_PZ_OTHER_EXPENSE_CURR,F9_01_PZ_OTHER_REV_CURR,F9_01_PZ_PROG_SERVICE_REV_CURR,F9_01_PZ_SALARIES_CURR,F9_01_PZ_SALARIES_PRIOR,F9_01_PZ_TOT_ASSETS_BOY,F9_01_PZ_TOT_EXP_CURR,F9_01_PZ_TOT_LIAB_BOY,F9_01_PZ_TOT_REV_CURR,F9_03_PC_PGMSVC_SIGNIF_CHG,F9_03_
PC_PGMSVC_SIGNIF_NEW,F9_03_PC_PROG_SVC_ACC_1_CODE,F9_03_PC_PROG_SVC_ACC_1_DESC,F9_03_PC_PROG_SVC_ACC_1_EXP,F9_03_PC_PROG_SVC_ACC_1_GRNT,F9_03_PC_PROG_SVC_ACC_1_REV,F9_03_PC_PROG_SVC_ACC_2_CODE,F9_03_PC_PROG_SVC_ACC_2_DESC,F9_03_PC_PROG_SVC_ACC_2_EXP,F9_03_PC_PROG_SVC_ACC_2_GRNT,F9_03_PC_PROG_SVC_ACC_2_REV,F9_03_PC_PROG_SVC_ACC_3_CODE,F9_03_PC_PROG_SVC_ACC_3_DESC,F9_03_PC_PROG_SVC_ACC_3_EXP,F9_03_PC_PROG_SVC_ACC_3_GRNT,F9_03_PC_PROG_SVC_ACC_3_REV,F9_03_PC_TOT_OTH_PROG_SVC_EXP,F9_03_PC_TOT_OTH_PROG_SVC_GRNT,F9_03_PC_TOT_OTH_PROG_SVC_REV,F9_03_PC_TOT_PROG_SVC_EXPENSE,F9_03_PZ_MISSION_DESCRIPTION,F9_03_PZ_SCHEDULE_O_PART3,F9_04_PC_ACTVITIES_VIA_PARTNER,F9_04_PC_CONTROLLED_ENTITY,F9_04_PC_DISREGARDED_ENTITY,F9_04_PC_EXCESS_BENEFIT_TRANS,F9_04_PC_FR_EVENT_INC_GT_15K,F9_04_PC_GAMING_INC_GT_15K,F9_04_PC_LOBBYING_ACTIVITIES,F9_04_PC_POLITICAL_ACTIVITIES,F9_04_PC_PRIOR_EXCESS_BEN_TRAN,F9_04_PC_PROF_FR_EXP_GT_15K,F9_04_PC_RELATED_ENTITY,F9_04_PC_TRANS_TO_CNTRLD_ENT,F9_04_PC_TRANS_WITH_CNTRLD_ENT,F9_05_EXP_SCHED_O_X,F9_05_PC_NUMBER_EMPLOYEES_W3,F9_05_PC_NUMBER_FORMS_1096,F9_05_PC_UNRELATED_BUS_INCOME,F9_06_EXP_SCHED_O_X,F9_06_PC_990_PROVIDED_GOV_BODY,F9_06_PC_ANNUAL_DISC_COVRD_PERS,F9_06_PC_CEO_COMPENSTN_PROCESS,F9_06_PC_CHANGES_ORGANIZING_DOCS,F9_06_PC_CONFLICT_OF_INTEREST,F9_06_PC_DECISIONS_SUBJ_APPROVAL,F9_06_PC_DELEGATION_MGT_DUTIES,F9_06_PC_DELEGATION_OF_MGT,F9_06_PC_DOCUMENT_RET_POLICY,F9_06_PC_ELECTION_BOARD_MEMBERS,F9_06_PC_FAMILY_OR_BUSINESS_REL,F9_06_PC_FORM_AVAIL_OWN_WEBSITE,F9_06_PC_FORM_UPON_REQUEST,F9_06_PC_JOINT_VENTURE_INVESTMNT,F9_06_PC_JOINT_VENTURE_POLICY,F9_06_PC_LOCAL_CHAPTERS,F9_06_PC_MATERIAL_DIVERSION,F9_06_PC_MEMBERS_OR_STOCKHOLDERS,F9_06_PC_MINUTES_COMMITTEES,F9_06_PC_MINUTES_GOVERNING_BODY,F9_06_PC_MONITORING_OF_COI_POLICY,F9_06_PC_NUM_IND_VOTING_MEMBERS,F9_06_PC_NUM_VOTING_GOV_MEMBERS,F9_06_PC_OFFICER_MAILING_ADDRESS,F9_06_PC_OTHER_COMPENSTN_PROCESS,F9_06_PC_OTHER_WEBSITE,F9_06_PC_OWN_WEBSITE,F9_06_PC_POLICIES_GOVERN_CHAPTER,F9_06_PC_STATES_WHERE_RET
_FILED,F9_06_PC_WHISTLEBLOWER_POLICY,F9_07_EXP_SCHED_O_X,F9_07_PC_COMPENSATION_OTHER_SRCE,F9_07_PC_FORMER_OFFICER_LISTED,F9_07_PC_NO_LISTED_PERS_COMPENSD,F9_07_PC_NUM_CONTRCTRS_GRTR_100K,F9_07_PC_NUM_INDS_GREATER_100K,F9_07_PC_TOTAL_COMP_GRTR_150K,F9_07_PC_TOT_OTHER_COMPENSATION,F9_07_PC_TOT_REPRT_COMP_FROM_ORG,F9_07_PC_TOT_REPRT_COMP_RLTD_ORG,F9_08_EXP_SCHED_O_X,F9_08_PC_ALL_OTHER_CONTRIBUTIONS,F9_08_PC_CONTS_REPRTD_FNDRAISNG,F9_08_PC_COST_OF_GOODS_SOLD,F9_08_PC_FEDERATED_CAMPAIGNS,F9_08_PC_FUNDRAISING_DIRECT_EXP,F9_08_PC_FUNDRAISING_EVENTS,F9_08_PC_FUNDRAISING_GROSS_INC,F9_08_PC_GAMING_DIRECT_EXPENSES,F9_08_PC_GAMING_GROSS_INCOME,F9_08_PC_GOVERNMENT_GRANTS,F9_08_PC_GROSS_SALES_INVENTORY,F9_08_PC_MEMBERSHIP_DUES,F9_08_PC_NONCASH_CONTRIBUTIONS,F9_08_PC_PROGRAM_SVCE_REV_TOTAL,F9_08_PC_RELATED_ORGANIZATIONS,F9_08_PC_TOTAL_CONTRIBUTIONS,F9_08_PC_TOTAL_OTHER_REVENUE,F9_08_PC_TOTAL_PROG_SVCE_REVENUE,F9_08_PC_TOTAL_REVENUE,F9_09_EXP_AD_PROMO_TOT,F9_09_EXP_BENF_PAID_MEMB_TOT,F9_09_EXP_CONF_MEETING_TOT,F9_09_EXP_DEPREC_FUNDR,F9_09_EXP_DEPREC_MAG,F9_09_EXP_DEPREC_PROG,F9_09_EXP_DEPREC_TOT,F9_09_EXP_GRANT_FRGN_TOT,F9_09_EXP_GRANT_INDIV_DMSTC_TOT,F9_09_EXP_GRANT_ORG_DMSTC_TOT,F9_09_EXP_INFO_TECH_TOT,F9_09_EXP_INSURANCE_TOT,F9_09_EXP_INTEREST_TOT,F9_09_EXP_JOINT_COSTS_TOT,F9_09_EXP_OCCUPANCY_TOT,F9_09_EXP_OFFICE_TOT,F9_09_EXP_OTH_OTH_TOT,F9_09_EXP_ROY_TOT,F9_09_EXP_SCHED_O_X,F9_09_EXP_TRAVEL_ENTRTNMNT_TOT,F9_09_EXP_TRAVEL_TOT,F9_09_PC_COMP_DISQUAL_FUNDRAISE,F9_09_PC_COMP_DISQUAL_MGMT,F9_09_PC_COMP_DISQUAL_PROG_SVCE,F9_09_PC_COMP_DISQUAL_TOTAL,F9_09_PC_COMP_OFFICERS_FUNDRAISE,F9_09_PC_COMP_OFFICERS_MGMT,F9_09_PC_COMP_OFFICERS_PROG_SVCE,F9_09_PC_COMP_OFFICERS_TOTAL,F9_09_PC_FEES_FOR_SVCE_ACCT_TOT,F9_09_PC_FEES_FOR_SVCE_INVST_TOT,F9_09_PC_FEES_FOR_SVCE_LEGL_TOT,F9_09_PC_FEES_FOR_SVCE_LOBB_TOT,F9_09_PC_FEES_FOR_SVCE_MGMT_TOT,F9_09_PC_FEES_FOR_SVCE_OTH_TOT,F9_09_PC_OTHER_EMP_BEN_FUNDRAISE,F9_09_PC_OTHER_EMP_BEN_MGMT,F9_09_PC_OTHER_EMP_BEN_PROG_SVCE,F9_09_PC_OTHER_EMP_BEN_TOTAL,F9_09
_PC_OTHER_SALARY_FUNDRAISE,F9_09_PC_OTHER_SALARY_MGMT,F9_09_PC_OTHER_SALARY_PROG_SVCE,F9_09_PC_OTHER_SALARY_TOTAL,F9_09_PC_PAYMENT_TO_AFFILIATES,F9_09_PC_PAYROLL_TAX_FUNDRAISE,F9_09_PC_PAYROLL_TAX_MGMT,F9_09_PC_PAYROLL_TAX_PROG_SVCE,F9_09_PC_PAYROLL_TAX_TOTAL,F9_09_PC_PENSION_CONT_FUNDRAISE,F9_09_PC_PENSION_CONT_MGMT,F9_09_PC_PENSION_CONT_PROG_SVCE,F9_09_PC_PENSION_CONT_TOTAL,F9_09_PC_TOTAL_FUNC_EXPENSES,F9_09_PC_TOTAL_FUNDRAISE_EXPENSE,F9_09_PC_TOTAL_MGMT_EXPENSE,F9_09_PC_TOTAL_PROG_SVCE_EXPENSE,F9_10_ASSETS_ACC_NET_EOY,F9_10_ASSETS_EXP_PREPAID_EOY,F9_10_ASSETS_INTANGIB_EOY,F9_10_ASSETS_INVENT_SALE_EOY,F9_10_ASSETS_LESS_DEPREC_EOY,F9_10_ASSETS_LOANS_DISQUAL_EOY,F9_10_ASSETS_NOTES_LOANS_NET_EOY,F9_10_ASSETS_OTH_EOY,F9_10_ASSETS_PLEDGES_NET_EOY,F9_10_LIAB_ACC_PAYABLE_EOY,F9_10_LIAB_GRANTS_PAYABLE_EOY,F9_10_LIAB_LOANS_OFF_EOY,F9_10_LIAB_REV_DEFERRED_EOY,F9_10_NAFB_RESTRICT_PERM_EOY,F9_10_NAFB_RESTRICT_TEMP_EOY,F9_10_NAFB_UNRESTRICT_EOY,F9_10_PC_BOND_LIABILITY_EOY,F9_10_PC_CASH_NON_INTEREST_BOY,F9_10_PC_CASH_NON_INTEREST_EOY,F9_10_PC_ESCROW_LIABILITY_EOY,F9_10_PC_INVEST_OTHER_SEC_EOY,F9_10_PC_INVEST_PROG_RELTD_EOY,F9_10_PC_INVEST_PUB_TRADED_EOY,F9_10_PC_LAND_BLDG_EQPMT,F9_10_PC_LAND_BLDG_EQPMT_DEPRCTN,F9_10_PC_LOANS_FROM_OFFICERS_EOY,F9_10_PC_ORG_FOLLOWS_SFAS117,F9_10_PC_ORG_NOT_FOLLOW_SFAS117,F9_10_PC_OTHER_LIABILITIES_EOY,F9_10_PC_RET_EARNINGS_ENDWMT_EOY,F9_10_PC_SAVINGS_TEMP_INVEST_BOY,F9_10_PC_SAVINGS_TEMP_INVEST_EOY,F9_10_PC_SECURED_MORTGAGES_EOY,F9_10_PC_SECURE_MORT_NOTES_EOY,F9_10_PC_UNSECURED_LOANS_EOY,F9_10_PC_UNSECURED_NOTES_BOY,F9_10_PC_UNSECURED_NOTES_EOY,F9_10_PZ_TOTAL_ASSETS_EOY,F9_10_SCHED_O_X,F9_11_PC_RECNCLTN_DONATED_SVCES,F9_11_PC_RECNCLTN_INVSTMNT_EXP,F9_11_PC_RECNCLTN_PRIOR_PER_ADJ,F9_11_PC_RECNCLTN_REV_LESS_EXP,F9_11_PC_RECNCLTN_UNRLZD_GAIN,F9_11_SCHED_O_X,F9_12_PC_ACCNT_COMPILE_OR_REVIEW,F9_12_PC_ACCTG_METHOD_ACCRUAL,F9_12_PC_ACCTG_METHOD_CASH,F9_12_PC_ACCTG_METHOD_OTHER,F9_12_PC_AUDIT_COMMITTEE,F9_12_PC_FED_GRNT_AUDIT_PERFORMD,F9_12_PC_FED_GRNT_A
UDIT_REQUIRED,F9_12_PC_FINCL_STMTS_AUDITED,F9_12_SCHED_O_X,number_of_other_prog_svces,501c3,F9_00_HD_FILER_ADDR_US_L1,F9_00_HD_FILER_ADDR_US_L2,F9_00_HD_FILER_CITY_US,F9_00_HD_FILER_ZIP_US,F9_00_HD_FILER_COUNTRY_FRGN,F9_00_HD_FILER_STATE_US,F9_00_HD_TIME_STAMP_yr,ein_int,BMF_EIN2,BMF_EIN,BMF_NTEE_IRS,BMF_NTEE_NCCS,BMF_NTEEV2,BMF_NCCS_LEVEL_1,BMF_NCCS_LEVEL_2,BMF_NCCS_LEVEL_3,BMF_F990_TOTAL_REVENUE_RECENT,BMF_F990_TOTAL_INCOME_RECENT,BMF_F990_TOTAL_ASSETS_RECENT,BMF_F990_ORG_ADDR_CITY,BMF_F990_ORG_ADDR_STATE,BMF_F990_ORG_ADDR_ZIP,BMF_F990_ORG_ADDR_STREET,BMF_CENSUS_CBSA_FIPS,BMF_CENSUS_CBSA_NAME,BMF_CENSUS_BLOCK_FIPS,BMF_CENSUS_URBAN_AREA,BMF_CENSUS_STATE_ABBR,BMF_CENSUS_COUNTY_NAME,BMF_ORG_ADDR_FULL,BMF_ORG_ADDR_MATCH,BMF_LATITUDE,BMF_LONGITUDE,BMF_GEOCODER_SCORE,BMF_GEOCODER_MATCH,BMF_BMF_SUBSECTION_CODE,BMF_BMF_STATUS_CODE,BMF_BMF_PF_FILING_REQ_CODE,BMF_BMF_ORGANIZATION_CODE,BMF_BMF_INCOME_CODE,BMF_BMF_GROUP_EXEMPT_NUM,BMF_BMF_FOUNDATION_CODE,BMF_BMF_FILING_REQ_CODE,BMF_BMF_DEDUCTIBILITY_CODE,BMF_BMF_CLASSIFICATION_CODE,BMF_BMF_ASSET_CODE,BMF_BMF_AFFILIATION_CODE,BMF_ORG_RULING_DATE,BMF_ORG_FISCAL_YEAR,BMF_ORG_RULING_YEAR,BMF_ORG_YEAR_FIRST,BMF_ORG_YEAR_LAST,BMF_ORG_YEAR_COUNT,BMF_ORG_PERS_ICO,BMF_ORG_NAME_SEC,BMF_ORG_NAME_CURRENT,BMF_ORG_FISCAL_PERIOD
0,5d019e6778ffca27b42818d7,RONALD MCDONALD HOUSE CHARITIES- PHILADELPHIA REGION INC,https://s3.amazonaws.com/irs-form-990/201113139349301301_public.xml,93493313013011,201012,0,2016-02-24 21:20:13+00:00,,232705170,"{'BusinessNameLine1': 'RONALD MCDONALD HOUSE CHARITIES-', 'BusinessNameLine2': 'PHILADELPHIA REGION INC'}",RONA,8565826843,"{'AddressLine1': '1525 VALLEY CENTER PARKWAY NO 300', 'AddressLine1Txt': None, 'AddressLine2': None, 'AddressLine2Txt': None, 'City': 'BETHLEHEM', 'CityNm': None, 'State': 'PA', 'StateAbbreviationCd': None, 'ZIPCd': None, 'ZIPCode': '18017'}",,,,,,,,1,0,,0,,1,0,,1473903,0,0,0,MICHAEL ANTON,2011-11-04,,PA,2010-01-01,2010-12-31,2010,2011-11-09 12:41:09+00:00,0,1,0,,0,,1992,0,1439340,1044925,638637,10,30447,1753405,243131,0,0,0,0,89152,193604,0,2440859,881768,195892,0,0,450430,1075372,0,0,10,0,925000,33563,1990429,MAKES GRANTS TO NON-PROFITS THAT DIRECTLY IMPROVE THE HEALTH AND WELL-BEING OF CHILDREN.,459751,1000,0,0,0,1925215,1384751,171810,1473903,0,0,,"RMHC OF THE PHILADELPHIA REGION, INC. GRANTS HUNDREDS OF THOUSANDS OF DOLLARS PER YEAR TO SUPPORT NON-PROFIT PROGRAMS THAT DIRECTLY IMPROVE THE HEALTH AND WELL-BEING OF CHILDREN. LOCALLY, RMHC SUPPORTS THE PHILADELPHIA, SOUTHERN NEW JERSEY AND DE...",1043744,925000,0,,,0,0,0,,,0,0,0,0,0,0,1043744,"THE CORPORATION IS ORGANIZED AND WILL BE OPERATED EXCLUSIVELY FOR CHARITABLE, EDUCATIONAL AND SCIENTIFIC PURPOSES WITHIN THE MEANING OF SECTION 501(C)(3) OF THE INTERNAL REVENUE CODE. 
SUCH PURPOSES SHALL BE LIMITED TO PROVIDING SUPPORT AND FUNDIN...",1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,1,1,1,10,10,0,0,0,0,0,"[""PA"", ""NJ"", ""DE""]",0,0,0,0,1,0,0,0,0,0,0,0,1439340,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1439340,1000,0,1473903,0,0,0,0,86228,0,86228,0,33000,892000,0,0,0,0,0,123,763,0,0,0,0,0,0,0,0,0,0,0,0,21675,0,215,0,0,0,0,0,0,0,0,0,0,0,118744,0,0,0,0,0,0,0,0,1384751,195892,145115,1043744,147981,0,0,0,170617,0,0,0,0,44353,166000,0,0,0,0,1990429,0,0,0,0,1851561,0,0,256845,86228,0,1,0,240077,0,332660,270700,0,0,0,0,0,2440859,0,0,0,0,89152,0,1,0,1,0,,1,0,0,1,1,,1,1525 VALLEY CENTER PARKWAY NO 300,,BETHLEHEM,18017,,PA,2011,232705170,EIN-23-2705170,232705170,E86,E86,HEL-E86-RG,501C3 CHARITY,O,HE,,3141381.0,1408486.0,PHILADELPHIA,PA,19104,3925 CHESTNUT ST,37980.0,"Philadelphia-Camden-Wilmington, PA-NJ-DE-MD",421010100000000.0,U,PA,Philadelphia County,"3925 CHESTNUT ST,PHILADELPHIA,PA,19104","3925 Chestnut St, Philadelphia, Pennsylvania, 19104",39.955542,-75.201247,100.0,M,3.0,,,,,0.0,15.0,10.0,,,,,2007-12,2019.0,2007.0,1995.0,2019.0,25.0,,PHILADELPHIA REGION INC,RONALD MCDONALD HOUSE CHARITIES INC,12.0


In [40]:
[c for c in merged.columns.tolist() if '501' in c]

['F9_00_HD_EXEMPT_STATUS_501C', 'F9_00_HD_EXEMPT_STATUS_501C3', '501c3']

In [43]:
merged[['501c3', 'BMF_NCCS_LEVEL_1', 'BMF_BMF_SUBSECTION_CODE']].sample(10)

Unnamed: 0,501c3,BMF_NCCS_LEVEL_1,BMF_BMF_SUBSECTION_CODE
2675377,1,501C3 CHARITY,3.0
1425913,1,501C3 CHARITY,3.0
309364,1,501C3 CHARITY,3.0
2552158,1,501C3 CHARITY,3.0
2423511,0,501CX NONPROFIT,7.0
2554857,1,501C3 CHARITY,3.0
2301273,1,501C3 CHARITY,3.0
3111861,1,501C3 CHARITY,3.0
2416494,1,501C3 CHARITY,3.0
3455390,1,501C3 CHARITY,3.0


In [44]:
print(len(merged[merged['BMF_BMF_SUBSECTION_CODE'].isnull()]))
print(len(merged[merged['501c3'].isnull()]))

27550
0


In [46]:
len(df)

3469008

In [45]:
print(merged['BMF_BMF_SUBSECTION_CODE'].value_counts(), '\n')
print(merged['501c3'].value_counts())

BMF_BMF_SUBSECTION_CODE
3.0     2640582
6.0      226688
4.0      115446
5.0      104216
7.0       96271
9.0       54704
8.0       42043
19.0      40471
12.0      32827
14.0      27154
2.0       21376
13.0      20215
10.0      12494
25.0       4473
91.0       1937
15.0       1347
17.0        620
1.0         541
92.0        297
29.0        116
16.0         96
26.0         91
11.0         68
50.0         40
27.0         36
23.0         28
71.0         14
18.0         13
20.0         12
21.0          4
0.0           1
Name: count, dtype: int64 

501c3
1    2639594
0     832177
Name: count, dtype: int64
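
A quick consistency check on the counts above: the two `501c3` categories sum to 2,639,594 + 832,177 = 3,471,771 rows in `merged`, while `len(df)` is 3,469,008, a gap of 2,763 rows. Assuming `merged` was built by left-merging `bmf` onto `df` by EIN (that cell is not shown here), one possible explanation is repeated EINs on the BMF side. Below is a minimal sketch on made-up data of how that could be checked. The frames `filings` and `bmf_toy` and their values are hypothetical, not the notebook's data.

```python
import pandas as pd

# Hypothetical stand-ins for the 990 filings (left) and the BMF (right).
filings = pd.DataFrame({"EIN": ["010000001", "010000001", "020000002"],
                        "tax_year": [2019, 2020, 2020]})
bmf_toy = pd.DataFrame({"EIN": ["010000001", "020000002", "020000002"],
                        "subsection": [3.0, 3.0, 4.0]})

# EINs appearing more than once on the right side duplicate the matching filings.
dup_eins = bmf_toy.loc[bmf_toy["EIN"].duplicated(keep=False), "EIN"].unique().tolist()

# Rows gained by the left merge relative to the filings table.
merged_toy = filings.merge(bmf_toy, on="EIN", how="left")
extra_rows = len(merged_toy) - len(filings)

# validate="many_to_one" makes pandas raise if the right-side key is not unique.
try:
    filings.merge(bmf_toy, on="EIN", how="left", validate="many_to_one")
    unique_right = True
except pd.errors.MergeError:
    unique_right = False

print(dup_eins, extra_rows, unique_right)
```

If the right-side key turns out to be unique in the real data, the gap would have to come from somewhere else (for example, `df` having been modified after the merge).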


In [47]:
len(bmf)

3462997

In [48]:
pd.crosstab(merged['501c3'], merged['BMF_BMF_SUBSECTION_CODE']==3)

BMF_BMF_SUBSECTION_CODE,False,True
501c3,Unnamed: 1_level_1,Unnamed: 2_level_1
0,826437,5740
1,4752,2634842


In [50]:
pd.crosstab(merged['501c3'], merged['BMF_NCCS_LEVEL_1']=='501C3 CHARITY')

BMF_NCCS_LEVEL_1,False,True
501c3,Unnamed: 1_level_1,Unnamed: 2_level_1
0,826861,5316
1,13928,2625666


In [51]:
pd.crosstab(merged['BMF_BMF_SUBSECTION_CODE']==3, merged['BMF_NCCS_LEVEL_1']=='501C3 CHARITY')

BMF_NCCS_LEVEL_1,False,True
BMF_BMF_SUBSECTION_CODE,Unnamed: 1_level_1,Unnamed: 2_level_1
False,831173,16
True,9616,2630966


In [52]:
pd.crosstab(merged['BMF_BMF_SUBSECTION_CODE']==3, merged['BMF_NCCS_LEVEL_1'])

BMF_NCCS_LEVEL_1,501C3 CHARITY,501C3 PRIVATE FOUNDATION,501CX NONPROFIT,UNDEFINED
BMF_BMF_SUBSECTION_CODE,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
False,16,282,803101,274
True,2630966,9216,213,187


<br>There are 4,752 filings -- and 2,832 unique EINs -- where the e-file-based variable *501c3* is coded as being a 501(c)(3) but the BMF variable is either missing or some other number besides '3'.

In 2024 the numbers were `5,074` and `3,821`, respectively.

In [53]:
print(len(merged[(merged['501c3']==1)&(merged['BMF_BMF_SUBSECTION_CODE']!=3)]))
print(len(set(merged[(merged['501c3']==1)&(merged['BMF_BMF_SUBSECTION_CODE']!=3)]['EIN'].tolist())))

4752
2832
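
Note that `!= 3` evaluates to True for missing values, which is why the 4,752 filings counted above include rows with no BMF subsection code at all as well as rows with a different code. A small sketch on made-up data (the `toy` frame is hypothetical) showing how the two cases could be separated:

```python
import numpy as np
import pandas as pd

# Hypothetical filings: two agree with the BMF, one has no BMF code,
# one has a conflicting BMF code, and one is a non-501(c)(3) in both sources.
toy = pd.DataFrame({
    "EIN": ["010000001", "010000001", "020000002", "030000003", "040000004"],
    "501c3": [1, 1, 1, 1, 0],
    "BMF_BMF_SUBSECTION_CODE": [3.0, 3.0, np.nan, 4.0, 6.0],
})

code = toy["BMF_BMF_SUBSECTION_CODE"]
efile_c3 = toy["501c3"] == 1

not_3 = efile_c3 & (code != 3)                      # NaN != 3 is True, so missing rows are included
missing = efile_c3 & code.isna()                    # e-file says 501(c)(3), BMF code missing
other_code = efile_c3 & code.notna() & (code != 3)  # e-file says 501(c)(3), BMF says otherwise

n_not_3, n_missing, n_other = int(not_3.sum()), int(missing.sum()), int(other_code.sum())
n_other_eins = toy.loc[other_code, "EIN"].nunique()
print(n_not_3, n_missing, n_other, n_other_eins)
```

By construction the first count is the sum of the other two, mirroring 4,752 = 4,329 + 423 in the real data.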


In [54]:
gc.collect()

598

# ENDED HERE 4/19/2025

<br>For most of these, the *BMF_BMF_SUBSECTION_CODE* value (formerly *BMF_SUBSECCD*) is missing: 4,329 of the 4,752 filings, covering 2,635 EINs.

In [55]:
print(len(merged[(merged['501c3']==1)&(merged['BMF_BMF_SUBSECTION_CODE'].isnull())]))
print(len(set(merged[(merged['501c3']==1)&(merged['BMF_BMF_SUBSECTION_CODE'].isnull())]['EIN'].tolist())))

4329
2635


<br>There are 423 filings -- for 197 EINs -- where the e-filing has the org coded as a 501(c)(3) but the BMF data has it coded as something else.

In [56]:
print(list(set(merged[merged['BMF_BMF_SUBSECTION_CODE'].notnull()]['BMF_BMF_SUBSECTION_CODE'].tolist())))

[0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0, 18.0, 19.0, 20.0, 21.0, 23.0, 25.0, 26.0, 27.0, 29.0, 50.0, 71.0, 91.0, 92.0]


In [57]:
# BMF subsection codes that are present and not 3, i.e. the BMF codes the org as something other than a 501(c)(3).
# Built from the data so the list always matches the codes that actually occur.
bmf_code = merged['BMF_BMF_SUBSECTION_CODE']
non_501c3 = sorted(bmf_code[bmf_code.notnull() & (bmf_code != 3)].unique())
print(len(merged[(merged['501c3']==1)&(merged['BMF_BMF_SUBSECTION_CODE'].isin(non_501c3))]))
print(len(set(merged[(merged['501c3']==1)&(merged['BMF_BMF_SUBSECTION_CODE'].isin(non_501c3))]['EIN'].tolist())))

423
197


In [58]:
EIN_filing501c3_BMFnot_list = list(set(merged[(merged['501c3']==1)&(merged['BMF_BMF_SUBSECTION_CODE'].isin(non_501c3))]['EIN'].tolist()))
print(len(EIN_filing501c3_BMFnot_list))
print(EIN_filing501c3_BMFnot_list[:5])

197
['263765844', '570641979', '396076509', '410919107', '593444871']


<br>This one seems to be a 501(c)(3). The IRS website's 990 forms for the org all have the 501(c)(3) box checked.

In [63]:
df[:1]

Unnamed: 0,_id,OrganizationName,URL,DLN,TaxPeriod,F9_09_PC_FEES_FOR_SVCE_FR_TOT,F9_00_HD_BUILD_TIME_STAMP,fiscal_year,EIN,Name,NameControl,Phone,USAddress,ForeignAddress,InCareOfName,BusinessName,BusinessNameControlTxt,PhoneNum,InCareOfNm,ForeignPhoneNum,F9_00_HD_ADDR_CHANGE,F9_00_HD_AMENDED_RETURN,F9_00_HD_CTRY_OF_DOMICILE,F9_00_HD_EXEMPT_STATUS_4847A1,F9_00_HD_EXEMPT_STATUS_501C,F9_00_HD_EXEMPT_STATUS_501C3,F9_00_HD_FINAL_RETURN,F9_00_HD_GROSS_EXEMPT_NUM,F9_00_HD_GROSS_RCPT,F9_00_HD_GROUP_RETURN,F9_00_HD_INCLUDES_SUBORD_ORGS,F9_00_HD_INITIAL_RETURN,F9_00_HD_PRIN_OFF_NAME,F9_00_HD_SIGNING_OFFICER_SIGNTR,F9_00_HD_SPECIAL_CONDITION_DESC,F9_00_HD_STATE_OF_DOMICILE,F9_00_HD_TAX_PER_BEGIN,F9_00_HD_TAX_PER_END,F9_00_HD_TAX_YEAR,F9_00_HD_TIME_STAMP,F9_00_HD_TYPE_ORG_ASSOCIATION,F9_00_HD_TYPE_ORG_CORP,F9_00_HD_TYPE_ORG_OTHER,F9_00_HD_TYPE_ORG_OTHER_DESC,F9_00_HD_TYPE_ORG_TRUST,F9_00_HD_WEBSITE,F9_00_HD_YEAR_FORMED,F9_01_PC_BEN_PAID_MEMB_PRIOR,F9_01_PC_CONTR_GRANTS_CURR,F9_01_PC_CONTR_GRANTS_PRIOR,F9_01_PC_GRANTS_PRIOR,F9_01_PC_INDEP_VOTING_MEMB,F9_01_PC_INVEST_INCOME_PRIOR,F9_01_PC_NET_ASSETS_BOY,F9_01_PC_OTHER_EXPENSE_PRIOR,F9_01_PC_OTHER_REV_PRIOR,F9_01_PC_PROF_FUNDRISING_EXP_CURR,F9_01_PC_PROF_FUNDRISING_EXP_PRIOR,F9_01_PC_PROG_SERVICE_REV_PRIOR,F9_01_PC_REV_LESS_EXP_CURR,F9_01_PC_REV_LESS_EXP_PRIOR,F9_01_PC_TERMINATION_CONTRACTION,F9_01_PC_TOT_ASSETS_EOY,F9_01_PC_TOT_EXP_PRIOR,F9_01_PC_TOT_FNDR_EXP_CURR,F9_01_PC_TOT_INDIV_EMPLOYED,F9_01_PC_TOT_INDIV_VOLUNTEERS,F9_01_PC_TOT_LIABILITIES_EOY,F9_01_PC_TOT_REVENUE_PRIOR,F9_01_PC_TOT_UBI_GROSS,F9_01_PC_TOT_UBI_NET,F9_01_PC_VOTING_MEMB_GOV_BODY,F9_01_PZ_BEN_PAID_TO_MEMB_CURR,F9_01_PZ_GRANTS_PAID_CURR,F9_01_PZ_INVEST_INCOME_CURR,F9_01_PZ_NAFB_EOY,F9_01_PZ_ORGANIZATIONAL_MISSION,F9_01_PZ_OTHER_EXPENSE_CURR,F9_01_PZ_OTHER_REV_CURR,F9_01_PZ_PROG_SERVICE_REV_CURR,F9_01_PZ_SALARIES_CURR,F9_01_PZ_SALARIES_PRIOR,F9_01_PZ_TOT_ASSETS_BOY,F9_01_PZ_TOT_EXP_CURR,F9_01_PZ_TOT_LIAB_BOY,F9_01_PZ_TOT_REV_CURR,F9_03_PC_PGMSVC_SIGNIF_CHG,F9_03_
PC_PGMSVC_SIGNIF_NEW,F9_03_PC_PROG_SVC_ACC_1_CODE,F9_03_PC_PROG_SVC_ACC_1_DESC,F9_03_PC_PROG_SVC_ACC_1_EXP,F9_03_PC_PROG_SVC_ACC_1_GRNT,F9_03_PC_PROG_SVC_ACC_1_REV,F9_03_PC_PROG_SVC_ACC_2_CODE,F9_03_PC_PROG_SVC_ACC_2_DESC,F9_03_PC_PROG_SVC_ACC_2_EXP,F9_03_PC_PROG_SVC_ACC_2_GRNT,F9_03_PC_PROG_SVC_ACC_2_REV,F9_03_PC_PROG_SVC_ACC_3_CODE,F9_03_PC_PROG_SVC_ACC_3_DESC,F9_03_PC_PROG_SVC_ACC_3_EXP,F9_03_PC_PROG_SVC_ACC_3_GRNT,F9_03_PC_PROG_SVC_ACC_3_REV,F9_03_PC_TOT_OTH_PROG_SVC_EXP,F9_03_PC_TOT_OTH_PROG_SVC_GRNT,F9_03_PC_TOT_OTH_PROG_SVC_REV,F9_03_PC_TOT_PROG_SVC_EXPENSE,F9_03_PZ_MISSION_DESCRIPTION,F9_03_PZ_SCHEDULE_O_PART3,F9_04_PC_ACTVITIES_VIA_PARTNER,F9_04_PC_CONTROLLED_ENTITY,F9_04_PC_DISREGARDED_ENTITY,F9_04_PC_EXCESS_BENEFIT_TRANS,F9_04_PC_FR_EVENT_INC_GT_15K,F9_04_PC_GAMING_INC_GT_15K,F9_04_PC_LOBBYING_ACTIVITIES,F9_04_PC_POLITICAL_ACTIVITIES,F9_04_PC_PRIOR_EXCESS_BEN_TRAN,F9_04_PC_PROF_FR_EXP_GT_15K,F9_04_PC_RELATED_ENTITY,F9_04_PC_TRANS_TO_CNTRLD_ENT,F9_04_PC_TRANS_WITH_CNTRLD_ENT,F9_05_EXP_SCHED_O_X,F9_05_PC_NUMBER_EMPLOYEES_W3,F9_05_PC_NUMBER_FORMS_1096,F9_05_PC_UNRELATED_BUS_INCOME,F9_06_EXP_SCHED_O_X,F9_06_PC_990_PROVIDED_GOV_BODY,F9_06_PC_ANNUAL_DISC_COVRD_PERS,F9_06_PC_CEO_COMPENSTN_PROCESS,F9_06_PC_CHANGES_ORGANIZING_DOCS,F9_06_PC_CONFLICT_OF_INTEREST,F9_06_PC_DECISIONS_SUBJ_APPROVAL,F9_06_PC_DELEGATION_MGT_DUTIES,F9_06_PC_DELEGATION_OF_MGT,F9_06_PC_DOCUMENT_RET_POLICY,F9_06_PC_ELECTION_BOARD_MEMBERS,F9_06_PC_FAMILY_OR_BUSINESS_REL,F9_06_PC_FORM_AVAIL_OWN_WEBSITE,F9_06_PC_FORM_UPON_REQUEST,F9_06_PC_JOINT_VENTURE_INVESTMNT,F9_06_PC_JOINT_VENTURE_POLICY,F9_06_PC_LOCAL_CHAPTERS,F9_06_PC_MATERIAL_DIVERSION,F9_06_PC_MEMBERS_OR_STOCKHOLDERS,F9_06_PC_MINUTES_COMMITTEES,F9_06_PC_MINUTES_GOVERNING_BODY,F9_06_PC_MONITORING_OF_COI_POLICY,F9_06_PC_NUM_IND_VOTING_MEMBERS,F9_06_PC_NUM_VOTING_GOV_MEMBERS,F9_06_PC_OFFICER_MAILING_ADDRESS,F9_06_PC_OTHER_COMPENSTN_PROCESS,F9_06_PC_OTHER_WEBSITE,F9_06_PC_OWN_WEBSITE,F9_06_PC_POLICIES_GOVERN_CHAPTER,F9_06_PC_STATES_WHERE_RET
_FILED,F9_06_PC_WHISTLEBLOWER_POLICY,F9_07_EXP_SCHED_O_X,F9_07_PC_COMPENSATION_OTHER_SRCE,F9_07_PC_FORMER_OFFICER_LISTED,F9_07_PC_NO_LISTED_PERS_COMPENSD,F9_07_PC_NUM_CONTRCTRS_GRTR_100K,F9_07_PC_NUM_INDS_GREATER_100K,F9_07_PC_TOTAL_COMP_GRTR_150K,F9_07_PC_TOT_OTHER_COMPENSATION,F9_07_PC_TOT_REPRT_COMP_FROM_ORG,F9_07_PC_TOT_REPRT_COMP_RLTD_ORG,F9_08_EXP_SCHED_O_X,F9_08_PC_ALL_OTHER_CONTRIBUTIONS,F9_08_PC_CONTS_REPRTD_FNDRAISNG,F9_08_PC_COST_OF_GOODS_SOLD,F9_08_PC_FEDERATED_CAMPAIGNS,F9_08_PC_FUNDRAISING_DIRECT_EXP,F9_08_PC_FUNDRAISING_EVENTS,F9_08_PC_FUNDRAISING_GROSS_INC,F9_08_PC_GAMING_DIRECT_EXPENSES,F9_08_PC_GAMING_GROSS_INCOME,F9_08_PC_GOVERNMENT_GRANTS,F9_08_PC_GROSS_SALES_INVENTORY,F9_08_PC_MEMBERSHIP_DUES,F9_08_PC_NONCASH_CONTRIBUTIONS,F9_08_PC_PROGRAM_SVCE_REV_TOTAL,F9_08_PC_RELATED_ORGANIZATIONS,F9_08_PC_TOTAL_CONTRIBUTIONS,F9_08_PC_TOTAL_OTHER_REVENUE,F9_08_PC_TOTAL_PROG_SVCE_REVENUE,F9_08_PC_TOTAL_REVENUE,F9_09_EXP_AD_PROMO_TOT,F9_09_EXP_BENF_PAID_MEMB_TOT,F9_09_EXP_CONF_MEETING_TOT,F9_09_EXP_DEPREC_FUNDR,F9_09_EXP_DEPREC_MAG,F9_09_EXP_DEPREC_PROG,F9_09_EXP_DEPREC_TOT,F9_09_EXP_GRANT_FRGN_TOT,F9_09_EXP_GRANT_INDIV_DMSTC_TOT,F9_09_EXP_GRANT_ORG_DMSTC_TOT,F9_09_EXP_INFO_TECH_TOT,F9_09_EXP_INSURANCE_TOT,F9_09_EXP_INTEREST_TOT,F9_09_EXP_JOINT_COSTS_TOT,F9_09_EXP_OCCUPANCY_TOT,F9_09_EXP_OFFICE_TOT,F9_09_EXP_OTH_OTH_TOT,F9_09_EXP_ROY_TOT,F9_09_EXP_SCHED_O_X,F9_09_EXP_TRAVEL_ENTRTNMNT_TOT,F9_09_EXP_TRAVEL_TOT,F9_09_PC_COMP_DISQUAL_FUNDRAISE,F9_09_PC_COMP_DISQUAL_MGMT,F9_09_PC_COMP_DISQUAL_PROG_SVCE,F9_09_PC_COMP_DISQUAL_TOTAL,F9_09_PC_COMP_OFFICERS_FUNDRAISE,F9_09_PC_COMP_OFFICERS_MGMT,F9_09_PC_COMP_OFFICERS_PROG_SVCE,F9_09_PC_COMP_OFFICERS_TOTAL,F9_09_PC_FEES_FOR_SVCE_ACCT_TOT,F9_09_PC_FEES_FOR_SVCE_INVST_TOT,F9_09_PC_FEES_FOR_SVCE_LEGL_TOT,F9_09_PC_FEES_FOR_SVCE_LOBB_TOT,F9_09_PC_FEES_FOR_SVCE_MGMT_TOT,F9_09_PC_FEES_FOR_SVCE_OTH_TOT,F9_09_PC_OTHER_EMP_BEN_FUNDRAISE,F9_09_PC_OTHER_EMP_BEN_MGMT,F9_09_PC_OTHER_EMP_BEN_PROG_SVCE,F9_09_PC_OTHER_EMP_BEN_TOTAL,F9_09
_PC_OTHER_SALARY_FUNDRAISE,F9_09_PC_OTHER_SALARY_MGMT,F9_09_PC_OTHER_SALARY_PROG_SVCE,F9_09_PC_OTHER_SALARY_TOTAL,F9_09_PC_PAYMENT_TO_AFFILIATES,F9_09_PC_PAYROLL_TAX_FUNDRAISE,F9_09_PC_PAYROLL_TAX_MGMT,F9_09_PC_PAYROLL_TAX_PROG_SVCE,F9_09_PC_PAYROLL_TAX_TOTAL,F9_09_PC_PENSION_CONT_FUNDRAISE,F9_09_PC_PENSION_CONT_MGMT,F9_09_PC_PENSION_CONT_PROG_SVCE,F9_09_PC_PENSION_CONT_TOTAL,F9_09_PC_TOTAL_FUNC_EXPENSES,F9_09_PC_TOTAL_FUNDRAISE_EXPENSE,F9_09_PC_TOTAL_MGMT_EXPENSE,F9_09_PC_TOTAL_PROG_SVCE_EXPENSE,F9_10_ASSETS_ACC_NET_EOY,F9_10_ASSETS_EXP_PREPAID_EOY,F9_10_ASSETS_INTANGIB_EOY,F9_10_ASSETS_INVENT_SALE_EOY,F9_10_ASSETS_LESS_DEPREC_EOY,F9_10_ASSETS_LOANS_DISQUAL_EOY,F9_10_ASSETS_NOTES_LOANS_NET_EOY,F9_10_ASSETS_OTH_EOY,F9_10_ASSETS_PLEDGES_NET_EOY,F9_10_LIAB_ACC_PAYABLE_EOY,F9_10_LIAB_GRANTS_PAYABLE_EOY,F9_10_LIAB_LOANS_OFF_EOY,F9_10_LIAB_REV_DEFERRED_EOY,F9_10_NAFB_RESTRICT_PERM_EOY,F9_10_NAFB_RESTRICT_TEMP_EOY,F9_10_NAFB_UNRESTRICT_EOY,F9_10_PC_BOND_LIABILITY_EOY,F9_10_PC_CASH_NON_INTEREST_BOY,F9_10_PC_CASH_NON_INTEREST_EOY,F9_10_PC_ESCROW_LIABILITY_EOY,F9_10_PC_INVEST_OTHER_SEC_EOY,F9_10_PC_INVEST_PROG_RELTD_EOY,F9_10_PC_INVEST_PUB_TRADED_EOY,F9_10_PC_LAND_BLDG_EQPMT,F9_10_PC_LAND_BLDG_EQPMT_DEPRCTN,F9_10_PC_LOANS_FROM_OFFICERS_EOY,F9_10_PC_ORG_FOLLOWS_SFAS117,F9_10_PC_ORG_NOT_FOLLOW_SFAS117,F9_10_PC_OTHER_LIABILITIES_EOY,F9_10_PC_RET_EARNINGS_ENDWMT_EOY,F9_10_PC_SAVINGS_TEMP_INVEST_BOY,F9_10_PC_SAVINGS_TEMP_INVEST_EOY,F9_10_PC_SECURED_MORTGAGES_EOY,F9_10_PC_SECURE_MORT_NOTES_EOY,F9_10_PC_UNSECURED_LOANS_EOY,F9_10_PC_UNSECURED_NOTES_BOY,F9_10_PC_UNSECURED_NOTES_EOY,F9_10_PZ_TOTAL_ASSETS_EOY,F9_10_SCHED_O_X,F9_11_PC_RECNCLTN_DONATED_SVCES,F9_11_PC_RECNCLTN_INVSTMNT_EXP,F9_11_PC_RECNCLTN_PRIOR_PER_ADJ,F9_11_PC_RECNCLTN_REV_LESS_EXP,F9_11_PC_RECNCLTN_UNRLZD_GAIN,F9_11_SCHED_O_X,F9_12_PC_ACCNT_COMPILE_OR_REVIEW,F9_12_PC_ACCTG_METHOD_ACCRUAL,F9_12_PC_ACCTG_METHOD_CASH,F9_12_PC_ACCTG_METHOD_OTHER,F9_12_PC_AUDIT_COMMITTEE,F9_12_PC_FED_GRNT_AUDIT_PERFORMD,F9_12_PC_FED_GRNT_A
UDIT_REQUIRED,F9_12_PC_FINCL_STMTS_AUDITED,F9_12_SCHED_O_X,number_of_other_prog_svces,501c3,F9_00_HD_FILER_ADDR_US_L1,F9_00_HD_FILER_ADDR_US_L2,F9_00_HD_FILER_CITY_US,F9_00_HD_FILER_ZIP_US,F9_00_HD_FILER_COUNTRY_FRGN,F9_00_HD_FILER_STATE_US,F9_00_HD_TIME_STAMP_yr,ein_int
0,5d019e6778ffca27b42818d7,RONALD MCDONALD HOUSE CHARITIES- PHILADELPHIA REGION INC,https://s3.amazonaws.com/irs-form-990/201113139349301301_public.xml,93493313013011,201012,0,2016-02-24 21:20:13+00:00,,232705170,"{'BusinessNameLine1': 'RONALD MCDONALD HOUSE CHARITIES-', 'BusinessNameLine2': 'PHILADELPHIA REGION INC'}",RONA,8565826843,"{'AddressLine1': '1525 VALLEY CENTER PARKWAY NO 300', 'AddressLine1Txt': None, 'AddressLine2': None, 'AddressLine2Txt': None, 'City': 'BETHLEHEM', 'CityNm': None, 'State': 'PA', 'StateAbbreviationCd': None, 'ZIPCd': None, 'ZIPCode': '18017'}",,,,,,,,1,0,,0,,1,0,,1473903,0,0,0,MICHAEL ANTON,2011-11-04,,PA,2010-01-01,2010-12-31,2010,2011-11-09 12:41:09+00:00,0,1,0,,0,,1992,0,1439340,1044925,638637,10,30447,1753405,243131,0,0,0,0,89152,193604,0,2440859,881768,195892,0,0,450430,1075372,0,0,10,0,925000,33563,1990429,MAKES GRANTS TO NON-PROFITS THAT DIRECTLY IMPROVE THE HEALTH AND WELL-BEING OF CHILDREN.,459751,1000,0,0,0,1925215,1384751,171810,1473903,0,0,,"RMHC OF THE PHILADELPHIA REGION, INC. GRANTS HUNDREDS OF THOUSANDS OF DOLLARS PER YEAR TO SUPPORT NON-PROFIT PROGRAMS THAT DIRECTLY IMPROVE THE HEALTH AND WELL-BEING OF CHILDREN. LOCALLY, RMHC SUPPORTS THE PHILADELPHIA, SOUTHERN NEW JERSEY AND DE...",1043744,925000,0,,,0,0,0,,,0,0,0,0,0,0,1043744,"THE CORPORATION IS ORGANIZED AND WILL BE OPERATED EXCLUSIVELY FOR CHARITABLE, EDUCATIONAL AND SCIENTIFIC PURPOSES WITHIN THE MEANING OF SECTION 501(C)(3) OF THE INTERNAL REVENUE CODE. 
SUCH PURPOSES SHALL BE LIMITED TO PROVIDING SUPPORT AND FUNDIN...",1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,1,1,1,10,10,0,0,0,0,0,"[""PA"", ""NJ"", ""DE""]",0,0,0,0,1,0,0,0,0,0,0,0,1439340,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1439340,1000,0,1473903,0,0,0,0,86228,0,86228,0,33000,892000,0,0,0,0,0,123,763,0,0,0,0,0,0,0,0,0,0,0,0,21675,0,215,0,0,0,0,0,0,0,0,0,0,0,118744,0,0,0,0,0,0,0,0,1384751,195892,145115,1043744,147981,0,0,0,170617,0,0,0,0,44353,166000,0,0,0,0,1990429,0,0,0,0,1851561,0,0,256845,86228,0,1,0,240077,0,332660,270700,0,0,0,0,0,2440859,0,0,0,0,89152,0,1,0,1,0,,1,0,0,1,1,,1,1525 VALLEY CENTER PARKWAY NO 300,,BETHLEHEM,18017,,PA,2011,232705170


In [64]:
[c for c in merged.columns if 'tax' in c.lower()]

['TaxPeriod',
 'F9_00_HD_TAX_PER_BEGIN',
 'F9_00_HD_TAX_PER_END',
 'F9_00_HD_TAX_YEAR',
 'F9_09_PC_PAYROLL_TAX_FUNDRAISE',
 'F9_09_PC_PAYROLL_TAX_MGMT',
 'F9_09_PC_PAYROLL_TAX_PROG_SVCE',
 'F9_09_PC_PAYROLL_TAX_TOTAL']

In [68]:
print(len(merged[merged['EIN'].isin(EIN_filing501c3_BMFnot_list)]))
merged[merged['EIN'].isin(['061555321'])][['501c3', 'BMF_BMF_SUBSECTION_CODE', 'BMF_BMF_FOUNDATION_CODE',
                                           'F9_00_HD_TAX_YEAR']][:20]

1270


Unnamed: 0,501c3,BMF_BMF_SUBSECTION_CODE,BMF_BMF_FOUNDATION_CODE,F9_00_HD_TAX_YEAR
15,1,71.0,17.0,2010
118345,1,71.0,17.0,2011
555775,1,71.0,17.0,2013
623442,1,71.0,17.0,2012
847644,1,71.0,17.0,2014
1024310,1,71.0,17.0,2015
1237081,1,71.0,17.0,2016
1542673,1,71.0,17.0,2017
1758008,1,71.0,17.0,2018
1985073,1,71.0,17.0,2019


In [45]:
print(len(merged[merged['EIN'].isin(EIN_filing501c3_BMFnot_list)]))
merged[merged['EIN'].isin(['061555321'])][['501c3', 'BMF_BMF_SUBSECTION_CODE', 'BMF_BMF_FOUNDATION_CODE',]][:20]

332


Unnamed: 0,501c3,BMF_SUBSECCD,BMF_FNDNCD
100956,1,71.0,17.0
459490,1,71.0,17.0
827939,1,71.0,17.0


<br>For the variable *BMF_FNDNCD*, a value of 0 means the org is not a 501(c)(3); NCCS labels it "All organizations except 501(c)(3)": https://nccs-data.urban.org/dd2.php?close=1&form=BMF+08/2016
The coding seems to be the same under the updated variable name `BMF_BMF_FOUNDATION_CODE`.

In [70]:
EIN_filing501c3_BMFnot_df = merged[merged['EIN'].isin(EIN_filing501c3_BMFnot_list)]
print(len(EIN_filing501c3_BMFnot_df))
EIN_filing501c3_BMFnot_df[['501c3', 'BMF_BMF_SUBSECTION_CODE', 'BMF_BMF_FOUNDATION_CODE']][:10]

1270


Unnamed: 0,501c3,BMF_BMF_SUBSECTION_CODE,BMF_BMF_FOUNDATION_CODE
15,1,71.0,17.0
7082,0,6.0,0.0
9550,0,6.0,0.0
14537,1,4.0,0.0
17631,0,13.0,0.0
19377,1,4.0,0.0
21383,1,50.0,0.0
32246,1,5.0,15.0
35331,0,4.0,0.0
35867,0,6.0,0.0


<br>Value '17' for *BMF_FNDNCD* (now `BMF_BMF_FOUNDATION_CODE`) indicates "Supporting Organization 509(a)(3) for benefit and in conjunction with organization(s) coded 10-16"
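
Taken together, the two labels quoted above suggest a simple cross-check: a filing whose BMF subsection code is not 3 but whose foundation code is nonzero is internally inconsistent on the BMF side, since a foundation code of 0 is the one reserved for non-501(c)(3) orgs. A minimal sketch under that assumption, using a made-up `toy` frame and a lookup that contains only the two labels quoted in this notebook (all other codes are deliberately left unlabeled):

```python
import pandas as pd

# Only the two foundation-code labels quoted in this notebook.
foundation_labels = {
    0.0: "All organizations except 501(c)(3)",
    17.0: "Supporting Organization 509(a)(3) for benefit and in conjunction "
          "with organization(s) coded 10-16",
}

# Hypothetical rows shaped like the merged data.
toy = pd.DataFrame({
    "BMF_BMF_SUBSECTION_CODE": [71.0, 6.0, 5.0],
    "BMF_BMF_FOUNDATION_CODE": [17.0, 0.0, 15.0],
})

# Attach the known labels; codes without a quoted label become NaN.
toy["foundation_label"] = toy["BMF_BMF_FOUNDATION_CODE"].map(foundation_labels)

# Subsection says "not 501(c)(3)" while the foundation code says otherwise.
toy["bmf_conflict"] = (toy["BMF_BMF_SUBSECTION_CODE"] != 3) & (toy["BMF_BMF_FOUNDATION_CODE"] != 0)

print(toy[["BMF_BMF_SUBSECTION_CODE", "BMF_BMF_FOUNDATION_CODE", "bmf_conflict"]])
```

Rows flagged this way are candidates for a manual look at the org's filings, not automatic reclassification.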

In [72]:
[c for c in merged.columns if 'BMF_' in c]

['BMF_EIN2',
 'BMF_EIN',
 'BMF_NTEE_IRS',
 'BMF_NTEE_NCCS',
 'BMF_NTEEV2',
 'BMF_NCCS_LEVEL_1',
 'BMF_NCCS_LEVEL_2',
 'BMF_NCCS_LEVEL_3',
 'BMF_F990_TOTAL_REVENUE_RECENT',
 'BMF_F990_TOTAL_INCOME_RECENT',
 'BMF_F990_TOTAL_ASSETS_RECENT',
 'BMF_F990_ORG_ADDR_CITY',
 'BMF_F990_ORG_ADDR_STATE',
 'BMF_F990_ORG_ADDR_ZIP',
 'BMF_F990_ORG_ADDR_STREET',
 'BMF_CENSUS_CBSA_FIPS',
 'BMF_CENSUS_CBSA_NAME',
 'BMF_CENSUS_BLOCK_FIPS',
 'BMF_CENSUS_URBAN_AREA',
 'BMF_CENSUS_STATE_ABBR',
 'BMF_CENSUS_COUNTY_NAME',
 'BMF_ORG_ADDR_FULL',
 'BMF_ORG_ADDR_MATCH',
 'BMF_LATITUDE',
 'BMF_LONGITUDE',
 'BMF_GEOCODER_SCORE',
 'BMF_GEOCODER_MATCH',
 'BMF_BMF_SUBSECTION_CODE',
 'BMF_BMF_STATUS_CODE',
 'BMF_BMF_PF_FILING_REQ_CODE',
 'BMF_BMF_ORGANIZATION_CODE',
 'BMF_BMF_INCOME_CODE',
 'BMF_BMF_GROUP_EXEMPT_NUM',
 'BMF_BMF_FOUNDATION_CODE',
 'BMF_BMF_FILING_REQ_CODE',
 'BMF_BMF_DEDUCTIBILITY_CODE',
 'BMF_BMF_CLASSIFICATION_CODE',
 'BMF_BMF_ASSET_CODE',
 'BMF_BMF_AFFILIATION_CODE',
 'BMF_ORG_RULING_DATE',
 'BMF_ORG_

In [73]:
EIN_filing501c3_BMFnot_df[EIN_filing501c3_BMFnot_df['EIN']=='061555321'][['501c3', 'BMF_BMF_SUBSECTION_CODE', 
                                                                          'BMF_BMF_FOUNDATION_CODE', 'BMF_ORG_RULING_DATE']]

Unnamed: 0,501c3,BMF_BMF_SUBSECTION_CODE,BMF_BMF_FOUNDATION_CODE,BMF_ORG_RULING_DATE
15,1,71.0,17.0,2000-12
118345,1,71.0,17.0,2000-12
555775,1,71.0,17.0,2000-12
623442,1,71.0,17.0,2000-12
847644,1,71.0,17.0,2000-12
1024310,1,71.0,17.0,2000-12
1237081,1,71.0,17.0,2000-12
1542673,1,71.0,17.0,2000-12
1758008,1,71.0,17.0,2000-12
1985073,1,71.0,17.0,2000-12


In [74]:
EIN_filing501c3_BMFnot_df[EIN_filing501c3_BMFnot_df['EIN']=='061555321'][['501c3', 'BMF_BMF_SUBSECTION_CODE', 
                                                'BMF_BMF_FOUNDATION_CODE', 'BMF_ORG_RULING_DATE', 'F9_00_HD_TAX_YEAR']]

Unnamed: 0,501c3,BMF_BMF_SUBSECTION_CODE,BMF_BMF_FOUNDATION_CODE,BMF_ORG_RULING_DATE,F9_00_HD_TAX_YEAR
15,1,71.0,17.0,2000-12,2010
118345,1,71.0,17.0,2000-12,2011
555775,1,71.0,17.0,2000-12,2013
623442,1,71.0,17.0,2000-12,2012
847644,1,71.0,17.0,2000-12,2014
1024310,1,71.0,17.0,2000-12,2015
1237081,1,71.0,17.0,2000-12,2016
1542673,1,71.0,17.0,2000-12,2017
1758008,1,71.0,17.0,2000-12,2018
1985073,1,71.0,17.0,2000-12,2019


In [75]:
EIN_filing501c3_BMFnot_df_grouped = EIN_filing501c3_BMFnot_df[['EIN', '501c3', 'BMF_BMF_SUBSECTION_CODE', 
                                                               'BMF_BMF_FOUNDATION_CODE']].groupby('EIN').nunique()
EIN_filing501c3_BMFnot_df_grouped.describe().T

Unnamed: 0,count,mean,std,min,25%,50%,75%,max
501c3,197.0,1.730964,0.444588,1.0,1.0,2.0,2.0,2.0
BMF_BMF_SUBSECTION_CODE,197.0,1.0,0.0,1.0,1.0,1.0,1.0,1.0
BMF_BMF_FOUNDATION_CODE,197.0,1.0,0.0,1.0,1.0,1.0,1.0,1.0


In [48]:
EIN_filing501c3_BMFnot_df_grouped = EIN_filing501c3_BMFnot_df[['EIN', '501c3', 'BMF_BMF_SUBSECTION_CODE', 
                                                               'BMF_BMF_FOUNDATION_CODE']].groupby('EIN').nunique()
EIN_filing501c3_BMFnot_df_grouped.describe().T

Unnamed: 0,count,mean,std,min,25%,50%,75%,max
501c3,139.0,1.582734,0.494891,1.0,1.0,2.0,2.0,2.0
BMF_SUBSECCD,139.0,1.0,0.0,1.0,1.0,1.0,1.0,1.0
BMF_FNDNCD,139.0,0.964029,0.186892,0.0,1.0,1.0,1.0,1.0


In [78]:
EIN_filing501c3_BMFnot_df_grouped[:5]

Unnamed: 0_level_0,501c3,BMF_BMF_SUBSECTION_CODE,BMF_BMF_FOUNDATION_CODE
EIN,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
10366464,1,1,1
10428565,2,1,1
10597921,2,1,1
10732928,2,1,1
20227638,1,1,1


In [83]:
merged[merged['EIN']=='010366464'][['EIN', '501c3', 'BMF_BMF_SUBSECTION_CODE', 'BMF_BMF_FOUNDATION_CODE', 'F9_00_HD_TAX_YEAR']]

Unnamed: 0,EIN,501c3,BMF_BMF_SUBSECTION_CODE,BMF_BMF_FOUNDATION_CODE,F9_00_HD_TAX_YEAR
2229472,10366464,1,50.0,0.0,2020


In [82]:
merged[merged['EIN']=='010428565'][['EIN', '501c3', 'BMF_BMF_SUBSECTION_CODE', 'BMF_BMF_FOUNDATION_CODE', 'F9_00_HD_TAX_YEAR']]

Unnamed: 0,EIN,501c3,BMF_BMF_SUBSECTION_CODE,BMF_BMF_FOUNDATION_CODE,F9_00_HD_TAX_YEAR
2820978,10428565,0,13.0,0.0,2022
3227890,10428565,1,13.0,0.0,2023


In [84]:
merged[merged['EIN']=='010597921'][['EIN', '501c3', 'BMF_BMF_SUBSECTION_CODE', 'BMF_BMF_FOUNDATION_CODE', 'F9_00_HD_TAX_YEAR']]

Unnamed: 0,EIN,501c3,BMF_BMF_SUBSECTION_CODE,BMF_BMF_FOUNDATION_CODE,F9_00_HD_TAX_YEAR
897443,10597921,0,19.0,0.0,2015
1247607,10597921,0,19.0,0.0,2016
1394830,10597921,0,19.0,0.0,2017
1704295,10597921,0,19.0,0.0,2018
2077983,10597921,0,19.0,0.0,2019
2323326,10597921,0,19.0,0.0,2021
2538251,10597921,0,19.0,0.0,2020
2938229,10597921,0,19.0,0.0,2022
3179128,10597921,0,19.0,0.0,2023
3372606,10597921,1,19.0,0.0,2023


In [85]:
EIN_filing501c3_BMFnot_df_grouped[EIN_filing501c3_BMFnot_df_grouped['501c3']!=1]

Unnamed: 0_level_0,501c3,BMF_BMF_SUBSECTION_CODE,BMF_BMF_FOUNDATION_CODE
EIN,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
010428565,2,1,1
010597921,2,1,1
010732928,2,1,1
046190001,2,1,1
112912570,2,1,1
...,...,...,...
946398562,2,1,1
950751442,2,1,1
951038185,2,1,1
956346441,2,1,1


In [86]:
EIN_filing501c3_BMFnot_df_grouped = EIN_filing501c3_BMFnot_df_grouped.reset_index()
EIN_filing501c3_BMFnot_df_grouped[:1]

Unnamed: 0,EIN,501c3,BMF_BMF_SUBSECTION_CODE,BMF_BMF_FOUNDATION_CODE
0,10366464,1,1,1


In [87]:
#EIN_filing501c3_BMFnot_df_grouped = EIN_filing501c3_BMFnot_df_grouped.drop('EIN', 1)
#EIN_filing501c3_BMFnot_df_grouped = EIN_filing501c3_BMFnot_df_grouped.reset_index()
check_eins = EIN_filing501c3_BMFnot_df_grouped[EIN_filing501c3_BMFnot_df_grouped['501c3']!=1]['EIN'].tolist()
print(len(check_eins))

144


In [90]:
print(len(merged[merged['F9_00_HD_TAX_YEAR'].isnull()]))
print(len(merged[merged['EIN'].isnull()]))

0
0


In [89]:
merged = merged.sort_values(['EIN', 'F9_00_HD_TAX_YEAR'])

<br>An example of a changed designation: the organization with EIN = '161771606' switched to a 501(c)(6) in its 2017 filing, which its letter of determination confirms. An earlier determination letter granted 501(c)(3) status, so the org changed its designation. (The slice printed below shows other EINs from `check_eins`.)

In [92]:
print(len(merged[merged['EIN'].isin(check_eins)]))
merged[merged['EIN'].isin(check_eins)][['EIN', 'F9_00_HD_TAX_YEAR', '501c3', 'BMF_BMF_SUBSECTION_CODE', 'BMF_BMF_FOUNDATION_CODE',
                                        'BMF_ORG_RULING_DATE']][18:25]

1115


Unnamed: 0,EIN,F9_00_HD_TAX_YEAR,501c3,BMF_BMF_SUBSECTION_CODE,BMF_BMF_FOUNDATION_CODE,BMF_ORG_RULING_DATE
14537,10732928,2009,1,4.0,0.0,2004-12
151873,10732928,2010,1,4.0,0.0,2004-12
340781,10732928,2011,1,4.0,0.0,2004-12
565670,10732928,2012,1,4.0,0.0,2004-12
594925,46190001,2013,0,19.0,0.0,1946-03
686424,46190001,2014,0,19.0,0.0,1946-03
897819,46190001,2015,0,19.0,0.0,1946-03


In [93]:
merged[:1]

Unnamed: 0,_id,OrganizationName,URL,DLN,TaxPeriod,F9_09_PC_FEES_FOR_SVCE_FR_TOT,F9_00_HD_BUILD_TIME_STAMP,fiscal_year,EIN,Name,NameControl,Phone,USAddress,ForeignAddress,InCareOfName,BusinessName,BusinessNameControlTxt,PhoneNum,InCareOfNm,ForeignPhoneNum,F9_00_HD_ADDR_CHANGE,F9_00_HD_AMENDED_RETURN,F9_00_HD_CTRY_OF_DOMICILE,F9_00_HD_EXEMPT_STATUS_4847A1,F9_00_HD_EXEMPT_STATUS_501C,F9_00_HD_EXEMPT_STATUS_501C3,F9_00_HD_FINAL_RETURN,F9_00_HD_GROSS_EXEMPT_NUM,F9_00_HD_GROSS_RCPT,F9_00_HD_GROUP_RETURN,F9_00_HD_INCLUDES_SUBORD_ORGS,F9_00_HD_INITIAL_RETURN,F9_00_HD_PRIN_OFF_NAME,F9_00_HD_SIGNING_OFFICER_SIGNTR,F9_00_HD_SPECIAL_CONDITION_DESC,F9_00_HD_STATE_OF_DOMICILE,F9_00_HD_TAX_PER_BEGIN,F9_00_HD_TAX_PER_END,F9_00_HD_TAX_YEAR,F9_00_HD_TIME_STAMP,F9_00_HD_TYPE_ORG_ASSOCIATION,F9_00_HD_TYPE_ORG_CORP,F9_00_HD_TYPE_ORG_OTHER,F9_00_HD_TYPE_ORG_OTHER_DESC,F9_00_HD_TYPE_ORG_TRUST,F9_00_HD_WEBSITE,F9_00_HD_YEAR_FORMED,F9_01_PC_BEN_PAID_MEMB_PRIOR,F9_01_PC_CONTR_GRANTS_CURR,F9_01_PC_CONTR_GRANTS_PRIOR,F9_01_PC_GRANTS_PRIOR,F9_01_PC_INDEP_VOTING_MEMB,F9_01_PC_INVEST_INCOME_PRIOR,F9_01_PC_NET_ASSETS_BOY,F9_01_PC_OTHER_EXPENSE_PRIOR,F9_01_PC_OTHER_REV_PRIOR,F9_01_PC_PROF_FUNDRISING_EXP_CURR,F9_01_PC_PROF_FUNDRISING_EXP_PRIOR,F9_01_PC_PROG_SERVICE_REV_PRIOR,F9_01_PC_REV_LESS_EXP_CURR,F9_01_PC_REV_LESS_EXP_PRIOR,F9_01_PC_TERMINATION_CONTRACTION,F9_01_PC_TOT_ASSETS_EOY,F9_01_PC_TOT_EXP_PRIOR,F9_01_PC_TOT_FNDR_EXP_CURR,F9_01_PC_TOT_INDIV_EMPLOYED,F9_01_PC_TOT_INDIV_VOLUNTEERS,F9_01_PC_TOT_LIABILITIES_EOY,F9_01_PC_TOT_REVENUE_PRIOR,F9_01_PC_TOT_UBI_GROSS,F9_01_PC_TOT_UBI_NET,F9_01_PC_VOTING_MEMB_GOV_BODY,F9_01_PZ_BEN_PAID_TO_MEMB_CURR,F9_01_PZ_GRANTS_PAID_CURR,F9_01_PZ_INVEST_INCOME_CURR,F9_01_PZ_NAFB_EOY,F9_01_PZ_ORGANIZATIONAL_MISSION,F9_01_PZ_OTHER_EXPENSE_CURR,F9_01_PZ_OTHER_REV_CURR,F9_01_PZ_PROG_SERVICE_REV_CURR,F9_01_PZ_SALARIES_CURR,F9_01_PZ_SALARIES_PRIOR,F9_01_PZ_TOT_ASSETS_BOY,F9_01_PZ_TOT_EXP_CURR,F9_01_PZ_TOT_LIAB_BOY,F9_01_PZ_TOT_REV_CURR,F9_03_PC_PGMSVC_SIGNIF_CHG,F9_03_
PC_PGMSVC_SIGNIF_NEW,F9_03_PC_PROG_SVC_ACC_1_CODE,F9_03_PC_PROG_SVC_ACC_1_DESC,F9_03_PC_PROG_SVC_ACC_1_EXP,F9_03_PC_PROG_SVC_ACC_1_GRNT,F9_03_PC_PROG_SVC_ACC_1_REV,F9_03_PC_PROG_SVC_ACC_2_CODE,F9_03_PC_PROG_SVC_ACC_2_DESC,F9_03_PC_PROG_SVC_ACC_2_EXP,F9_03_PC_PROG_SVC_ACC_2_GRNT,F9_03_PC_PROG_SVC_ACC_2_REV,F9_03_PC_PROG_SVC_ACC_3_CODE,F9_03_PC_PROG_SVC_ACC_3_DESC,F9_03_PC_PROG_SVC_ACC_3_EXP,F9_03_PC_PROG_SVC_ACC_3_GRNT,F9_03_PC_PROG_SVC_ACC_3_REV,F9_03_PC_TOT_OTH_PROG_SVC_EXP,F9_03_PC_TOT_OTH_PROG_SVC_GRNT,F9_03_PC_TOT_OTH_PROG_SVC_REV,F9_03_PC_TOT_PROG_SVC_EXPENSE,F9_03_PZ_MISSION_DESCRIPTION,F9_03_PZ_SCHEDULE_O_PART3,F9_04_PC_ACTVITIES_VIA_PARTNER,F9_04_PC_CONTROLLED_ENTITY,F9_04_PC_DISREGARDED_ENTITY,F9_04_PC_EXCESS_BENEFIT_TRANS,F9_04_PC_FR_EVENT_INC_GT_15K,F9_04_PC_GAMING_INC_GT_15K,F9_04_PC_LOBBYING_ACTIVITIES,F9_04_PC_POLITICAL_ACTIVITIES,F9_04_PC_PRIOR_EXCESS_BEN_TRAN,F9_04_PC_PROF_FR_EXP_GT_15K,F9_04_PC_RELATED_ENTITY,F9_04_PC_TRANS_TO_CNTRLD_ENT,F9_04_PC_TRANS_WITH_CNTRLD_ENT,F9_05_EXP_SCHED_O_X,F9_05_PC_NUMBER_EMPLOYEES_W3,F9_05_PC_NUMBER_FORMS_1096,F9_05_PC_UNRELATED_BUS_INCOME,F9_06_EXP_SCHED_O_X,F9_06_PC_990_PROVIDED_GOV_BODY,F9_06_PC_ANNUAL_DISC_COVRD_PERS,F9_06_PC_CEO_COMPENSTN_PROCESS,F9_06_PC_CHANGES_ORGANIZING_DOCS,F9_06_PC_CONFLICT_OF_INTEREST,F9_06_PC_DECISIONS_SUBJ_APPROVAL,F9_06_PC_DELEGATION_MGT_DUTIES,F9_06_PC_DELEGATION_OF_MGT,F9_06_PC_DOCUMENT_RET_POLICY,F9_06_PC_ELECTION_BOARD_MEMBERS,F9_06_PC_FAMILY_OR_BUSINESS_REL,F9_06_PC_FORM_AVAIL_OWN_WEBSITE,F9_06_PC_FORM_UPON_REQUEST,F9_06_PC_JOINT_VENTURE_INVESTMNT,F9_06_PC_JOINT_VENTURE_POLICY,F9_06_PC_LOCAL_CHAPTERS,F9_06_PC_MATERIAL_DIVERSION,F9_06_PC_MEMBERS_OR_STOCKHOLDERS,F9_06_PC_MINUTES_COMMITTEES,F9_06_PC_MINUTES_GOVERNING_BODY,F9_06_PC_MONITORING_OF_COI_POLICY,F9_06_PC_NUM_IND_VOTING_MEMBERS,F9_06_PC_NUM_VOTING_GOV_MEMBERS,F9_06_PC_OFFICER_MAILING_ADDRESS,F9_06_PC_OTHER_COMPENSTN_PROCESS,F9_06_PC_OTHER_WEBSITE,F9_06_PC_OWN_WEBSITE,F9_06_PC_POLICIES_GOVERN_CHAPTER,F9_06_PC_STATES_WHERE_RET
_FILED,F9_06_PC_WHISTLEBLOWER_POLICY,F9_07_EXP_SCHED_O_X,F9_07_PC_COMPENSATION_OTHER_SRCE,F9_07_PC_FORMER_OFFICER_LISTED,F9_07_PC_NO_LISTED_PERS_COMPENSD,F9_07_PC_NUM_CONTRCTRS_GRTR_100K,F9_07_PC_NUM_INDS_GREATER_100K,F9_07_PC_TOTAL_COMP_GRTR_150K,F9_07_PC_TOT_OTHER_COMPENSATION,F9_07_PC_TOT_REPRT_COMP_FROM_ORG,F9_07_PC_TOT_REPRT_COMP_RLTD_ORG,F9_08_EXP_SCHED_O_X,F9_08_PC_ALL_OTHER_CONTRIBUTIONS,F9_08_PC_CONTS_REPRTD_FNDRAISNG,F9_08_PC_COST_OF_GOODS_SOLD,F9_08_PC_FEDERATED_CAMPAIGNS,F9_08_PC_FUNDRAISING_DIRECT_EXP,F9_08_PC_FUNDRAISING_EVENTS,F9_08_PC_FUNDRAISING_GROSS_INC,F9_08_PC_GAMING_DIRECT_EXPENSES,F9_08_PC_GAMING_GROSS_INCOME,F9_08_PC_GOVERNMENT_GRANTS,F9_08_PC_GROSS_SALES_INVENTORY,F9_08_PC_MEMBERSHIP_DUES,F9_08_PC_NONCASH_CONTRIBUTIONS,F9_08_PC_PROGRAM_SVCE_REV_TOTAL,F9_08_PC_RELATED_ORGANIZATIONS,F9_08_PC_TOTAL_CONTRIBUTIONS,F9_08_PC_TOTAL_OTHER_REVENUE,F9_08_PC_TOTAL_PROG_SVCE_REVENUE,F9_08_PC_TOTAL_REVENUE,F9_09_EXP_AD_PROMO_TOT,F9_09_EXP_BENF_PAID_MEMB_TOT,F9_09_EXP_CONF_MEETING_TOT,F9_09_EXP_DEPREC_FUNDR,F9_09_EXP_DEPREC_MAG,F9_09_EXP_DEPREC_PROG,F9_09_EXP_DEPREC_TOT,F9_09_EXP_GRANT_FRGN_TOT,F9_09_EXP_GRANT_INDIV_DMSTC_TOT,F9_09_EXP_GRANT_ORG_DMSTC_TOT,F9_09_EXP_INFO_TECH_TOT,F9_09_EXP_INSURANCE_TOT,F9_09_EXP_INTEREST_TOT,F9_09_EXP_JOINT_COSTS_TOT,F9_09_EXP_OCCUPANCY_TOT,F9_09_EXP_OFFICE_TOT,F9_09_EXP_OTH_OTH_TOT,F9_09_EXP_ROY_TOT,F9_09_EXP_SCHED_O_X,F9_09_EXP_TRAVEL_ENTRTNMNT_TOT,F9_09_EXP_TRAVEL_TOT,F9_09_PC_COMP_DISQUAL_FUNDRAISE,F9_09_PC_COMP_DISQUAL_MGMT,F9_09_PC_COMP_DISQUAL_PROG_SVCE,F9_09_PC_COMP_DISQUAL_TOTAL,F9_09_PC_COMP_OFFICERS_FUNDRAISE,F9_09_PC_COMP_OFFICERS_MGMT,F9_09_PC_COMP_OFFICERS_PROG_SVCE,F9_09_PC_COMP_OFFICERS_TOTAL,F9_09_PC_FEES_FOR_SVCE_ACCT_TOT,F9_09_PC_FEES_FOR_SVCE_INVST_TOT,F9_09_PC_FEES_FOR_SVCE_LEGL_TOT,F9_09_PC_FEES_FOR_SVCE_LOBB_TOT,F9_09_PC_FEES_FOR_SVCE_MGMT_TOT,F9_09_PC_FEES_FOR_SVCE_OTH_TOT,F9_09_PC_OTHER_EMP_BEN_FUNDRAISE,F9_09_PC_OTHER_EMP_BEN_MGMT,F9_09_PC_OTHER_EMP_BEN_PROG_SVCE,F9_09_PC_OTHER_EMP_BEN_TOTAL,F9_09
_PC_OTHER_SALARY_FUNDRAISE,F9_09_PC_OTHER_SALARY_MGMT,F9_09_PC_OTHER_SALARY_PROG_SVCE,F9_09_PC_OTHER_SALARY_TOTAL,F9_09_PC_PAYMENT_TO_AFFILIATES,F9_09_PC_PAYROLL_TAX_FUNDRAISE,F9_09_PC_PAYROLL_TAX_MGMT,F9_09_PC_PAYROLL_TAX_PROG_SVCE,F9_09_PC_PAYROLL_TAX_TOTAL,F9_09_PC_PENSION_CONT_FUNDRAISE,F9_09_PC_PENSION_CONT_MGMT,F9_09_PC_PENSION_CONT_PROG_SVCE,F9_09_PC_PENSION_CONT_TOTAL,F9_09_PC_TOTAL_FUNC_EXPENSES,F9_09_PC_TOTAL_FUNDRAISE_EXPENSE,F9_09_PC_TOTAL_MGMT_EXPENSE,F9_09_PC_TOTAL_PROG_SVCE_EXPENSE,F9_10_ASSETS_ACC_NET_EOY,F9_10_ASSETS_EXP_PREPAID_EOY,F9_10_ASSETS_INTANGIB_EOY,F9_10_ASSETS_INVENT_SALE_EOY,F9_10_ASSETS_LESS_DEPREC_EOY,F9_10_ASSETS_LOANS_DISQUAL_EOY,F9_10_ASSETS_NOTES_LOANS_NET_EOY,F9_10_ASSETS_OTH_EOY,F9_10_ASSETS_PLEDGES_NET_EOY,F9_10_LIAB_ACC_PAYABLE_EOY,F9_10_LIAB_GRANTS_PAYABLE_EOY,F9_10_LIAB_LOANS_OFF_EOY,F9_10_LIAB_REV_DEFERRED_EOY,F9_10_NAFB_RESTRICT_PERM_EOY,F9_10_NAFB_RESTRICT_TEMP_EOY,F9_10_NAFB_UNRESTRICT_EOY,F9_10_PC_BOND_LIABILITY_EOY,F9_10_PC_CASH_NON_INTEREST_BOY,F9_10_PC_CASH_NON_INTEREST_EOY,F9_10_PC_ESCROW_LIABILITY_EOY,F9_10_PC_INVEST_OTHER_SEC_EOY,F9_10_PC_INVEST_PROG_RELTD_EOY,F9_10_PC_INVEST_PUB_TRADED_EOY,F9_10_PC_LAND_BLDG_EQPMT,F9_10_PC_LAND_BLDG_EQPMT_DEPRCTN,F9_10_PC_LOANS_FROM_OFFICERS_EOY,F9_10_PC_ORG_FOLLOWS_SFAS117,F9_10_PC_ORG_NOT_FOLLOW_SFAS117,F9_10_PC_OTHER_LIABILITIES_EOY,F9_10_PC_RET_EARNINGS_ENDWMT_EOY,F9_10_PC_SAVINGS_TEMP_INVEST_BOY,F9_10_PC_SAVINGS_TEMP_INVEST_EOY,F9_10_PC_SECURED_MORTGAGES_EOY,F9_10_PC_SECURE_MORT_NOTES_EOY,F9_10_PC_UNSECURED_LOANS_EOY,F9_10_PC_UNSECURED_NOTES_BOY,F9_10_PC_UNSECURED_NOTES_EOY,F9_10_PZ_TOTAL_ASSETS_EOY,F9_10_SCHED_O_X,F9_11_PC_RECNCLTN_DONATED_SVCES,F9_11_PC_RECNCLTN_INVSTMNT_EXP,F9_11_PC_RECNCLTN_PRIOR_PER_ADJ,F9_11_PC_RECNCLTN_REV_LESS_EXP,F9_11_PC_RECNCLTN_UNRLZD_GAIN,F9_11_SCHED_O_X,F9_12_PC_ACCNT_COMPILE_OR_REVIEW,F9_12_PC_ACCTG_METHOD_ACCRUAL,F9_12_PC_ACCTG_METHOD_CASH,F9_12_PC_ACCTG_METHOD_OTHER,F9_12_PC_AUDIT_COMMITTEE,F9_12_PC_FED_GRNT_AUDIT_PERFORMD,F9_12_PC_FED_GRNT_A
UDIT_REQUIRED,F9_12_PC_FINCL_STMTS_AUDITED,F9_12_SCHED_O_X,number_of_other_prog_svces,501c3,F9_00_HD_FILER_ADDR_US_L1,F9_00_HD_FILER_ADDR_US_L2,F9_00_HD_FILER_CITY_US,F9_00_HD_FILER_ZIP_US,F9_00_HD_FILER_COUNTRY_FRGN,F9_00_HD_FILER_STATE_US,F9_00_HD_TIME_STAMP_yr,ein_int,BMF_EIN2,BMF_EIN,BMF_NTEE_IRS,BMF_NTEE_NCCS,BMF_NTEEV2,BMF_NCCS_LEVEL_1,BMF_NCCS_LEVEL_2,BMF_NCCS_LEVEL_3,BMF_F990_TOTAL_REVENUE_RECENT,BMF_F990_TOTAL_INCOME_RECENT,BMF_F990_TOTAL_ASSETS_RECENT,BMF_F990_ORG_ADDR_CITY,BMF_F990_ORG_ADDR_STATE,BMF_F990_ORG_ADDR_ZIP,BMF_F990_ORG_ADDR_STREET,BMF_CENSUS_CBSA_FIPS,BMF_CENSUS_CBSA_NAME,BMF_CENSUS_BLOCK_FIPS,BMF_CENSUS_URBAN_AREA,BMF_CENSUS_STATE_ABBR,BMF_CENSUS_COUNTY_NAME,BMF_ORG_ADDR_FULL,BMF_ORG_ADDR_MATCH,BMF_LATITUDE,BMF_LONGITUDE,BMF_GEOCODER_SCORE,BMF_GEOCODER_MATCH,BMF_BMF_SUBSECTION_CODE,BMF_BMF_STATUS_CODE,BMF_BMF_PF_FILING_REQ_CODE,BMF_BMF_ORGANIZATION_CODE,BMF_BMF_INCOME_CODE,BMF_BMF_GROUP_EXEMPT_NUM,BMF_BMF_FOUNDATION_CODE,BMF_BMF_FILING_REQ_CODE,BMF_BMF_DEDUCTIBILITY_CODE,BMF_BMF_CLASSIFICATION_CODE,BMF_BMF_ASSET_CODE,BMF_BMF_AFFILIATION_CODE,BMF_ORG_RULING_DATE,BMF_ORG_FISCAL_YEAR,BMF_ORG_RULING_YEAR,BMF_ORG_YEAR_FIRST,BMF_ORG_YEAR_LAST,BMF_ORG_YEAR_COUNT,BMF_ORG_PERS_ICO,BMF_ORG_NAME_SEC,BMF_ORG_NAME_CURRENT,BMF_ORG_FISCAL_PERIOD
3016720,65c1a1d52a9ba8ce45342904,,https://s3.amazonaws.com/irs-form-990/202323189349305317_public.xml,,,0,2023-04-26 12:10:37+00:00,2022,10017496,,,,"{'AddressLine1': None, 'AddressLine1Txt': 'PO BOX 534', 'AddressLine2': None, 'AddressLine2Txt': None, 'City': None, 'CityNm': 'YORK HARBOR', 'State': None, 'StateAbbreviationCd': 'ME', 'ZIPCd': '03911', 'ZIPCode': None}",,,{'BusinessNameLine1Txt': 'AGAMENTICUS YACHT CLUB INC'},AGAM,2073638510,,,0,0,,0,,1,0,,376800,0,0,0,DANIEL FORD,2023-11-13,,ME,2022-01-01,2022-12-31,2022,2023-11-14 16:30:26+00:00,0,1,0,,0,WWW.AYCSAIL.ORG,1937,0,279970,0,0,13,0,273331,0,0,0,0,0,184620,0,0,413907,0,2744,16,20,8282,0,0,0,13,0,0,34628,405625,"THE ORGANIZATION'S PRIMARY EXEMPT PURPOSE IS TO TEACH SAILING TO CHILDREN BY FOCUSING ON SAFETY, ENJOYMENT AND KNOWLEDGE OF SAILING.",132172,3377,54843,56026,0,273331,188198,0,372818,0,0,,"PROVIDES SAILING INSTRUCTION, SEAMAN-SHIP AND WATER SAFETY SKILLS TO CHILDREN.",167950,0,54843,,,0,0,0,,,0,0,0,0,0,0,167950,"THE ORGANIZATION'S PRIMARY EXEMPT PURPOSE IS TO TEACH YOUNGSTERS THE BASICS OF SAILING, SEAMAN-SHIP AND SAFE CONDUCT ON THE WATER. 
IT IS THE ORGANIZATION'S MISSION TO CREATE AND SUSTAIN A COMMUNITY OF FAMILIES WHO ENJOY BEING ON THE WATER.",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,16,2,0,1,1,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,1,1,0,13,13,0,0,0,0,0,,0,0,0,0,1,0,0,0,0,0,0,0,236965,0,0,0,0,0,0,0,0,0,3377,43005,0,54843,0,279970,0,54843,372818,0,0,0,0,0,16519,16519,0,0,0,84,21088,0,0,2776,1000,28263,0,1,0,0,0,0,0,0,0,0,0,0,6748,0,0,0,0,0,0,0,0,0,0,0,52045,52045,0,0,0,3981,3981,0,0,0,0,188198,2744,17504,167950,0,0,30000,0,174902,0,0,0,0,682,0,0,7600,0,0,0,0,29306,125682,0,0,0,83323,453817,278915,0,0,0,0,0,0,0,0,0,0,0,0,413907,0,0,0,0,184620,-52326,0,0,1,0,,0,0,0,0,0,,1,PO BOX 534,,YORK HARBOR,3911,,ME,2023,10017496,EIN-01-0017496,10017496,N50,N50,HMS-N50-RG,501C3 CHARITY,O,HS,372818.0,376800.0,413907.0,YORK HARBOR,ME,03911-0534,PO BOX 534,38860.0,"Portland-South Portland, ME",230310400000000.0,U,ME,York County,"PO BOX 534,YORK HARBOR,ME,03911-0534","03911-0534, York Harbor, Maine",43.134174,-70.640878,98.0,M,3.0,1.0,0.0,1.0,4.0,0.0,15.0,1.0,1.0,2000.0,4.0,3.0,1993-03,2024.0,1993.0,1995.0,2024.0,30.0,,,AGAMENTICUS YACHT CLUB OF YORK,3.0


In [95]:
merged[:1]

Unnamed: 0,_id,OrganizationName,URL,DLN,TaxPeriod,F9_09_PC_FEES_FOR_SVCE_FR_TOT,F9_00_HD_BUILD_TIME_STAMP,fiscal_year,EIN,Name,NameControl,Phone,USAddress,ForeignAddress,InCareOfName,BusinessName,BusinessNameControlTxt,PhoneNum,InCareOfNm,ForeignPhoneNum,F9_00_HD_ADDR_CHANGE,F9_00_HD_AMENDED_RETURN,F9_00_HD_CTRY_OF_DOMICILE,F9_00_HD_EXEMPT_STATUS_4847A1,F9_00_HD_EXEMPT_STATUS_501C,F9_00_HD_EXEMPT_STATUS_501C3,F9_00_HD_FINAL_RETURN,F9_00_HD_GROSS_EXEMPT_NUM,F9_00_HD_GROSS_RCPT,F9_00_HD_GROUP_RETURN,F9_00_HD_INCLUDES_SUBORD_ORGS,F9_00_HD_INITIAL_RETURN,F9_00_HD_PRIN_OFF_NAME,F9_00_HD_SIGNING_OFFICER_SIGNTR,F9_00_HD_SPECIAL_CONDITION_DESC,F9_00_HD_STATE_OF_DOMICILE,F9_00_HD_TAX_PER_BEGIN,F9_00_HD_TAX_PER_END,F9_00_HD_TAX_YEAR,F9_00_HD_TIME_STAMP,F9_00_HD_TYPE_ORG_ASSOCIATION,F9_00_HD_TYPE_ORG_CORP,F9_00_HD_TYPE_ORG_OTHER,F9_00_HD_TYPE_ORG_OTHER_DESC,F9_00_HD_TYPE_ORG_TRUST,F9_00_HD_WEBSITE,F9_00_HD_YEAR_FORMED,F9_01_PC_BEN_PAID_MEMB_PRIOR,F9_01_PC_CONTR_GRANTS_CURR,F9_01_PC_CONTR_GRANTS_PRIOR,F9_01_PC_GRANTS_PRIOR,F9_01_PC_INDEP_VOTING_MEMB,F9_01_PC_INVEST_INCOME_PRIOR,F9_01_PC_NET_ASSETS_BOY,F9_01_PC_OTHER_EXPENSE_PRIOR,F9_01_PC_OTHER_REV_PRIOR,F9_01_PC_PROF_FUNDRISING_EXP_CURR,F9_01_PC_PROF_FUNDRISING_EXP_PRIOR,F9_01_PC_PROG_SERVICE_REV_PRIOR,F9_01_PC_REV_LESS_EXP_CURR,F9_01_PC_REV_LESS_EXP_PRIOR,F9_01_PC_TERMINATION_CONTRACTION,F9_01_PC_TOT_ASSETS_EOY,F9_01_PC_TOT_EXP_PRIOR,F9_01_PC_TOT_FNDR_EXP_CURR,F9_01_PC_TOT_INDIV_EMPLOYED,F9_01_PC_TOT_INDIV_VOLUNTEERS,F9_01_PC_TOT_LIABILITIES_EOY,F9_01_PC_TOT_REVENUE_PRIOR,F9_01_PC_TOT_UBI_GROSS,F9_01_PC_TOT_UBI_NET,F9_01_PC_VOTING_MEMB_GOV_BODY,F9_01_PZ_BEN_PAID_TO_MEMB_CURR,F9_01_PZ_GRANTS_PAID_CURR,F9_01_PZ_INVEST_INCOME_CURR,F9_01_PZ_NAFB_EOY,F9_01_PZ_ORGANIZATIONAL_MISSION,F9_01_PZ_OTHER_EXPENSE_CURR,F9_01_PZ_OTHER_REV_CURR,F9_01_PZ_PROG_SERVICE_REV_CURR,F9_01_PZ_SALARIES_CURR,F9_01_PZ_SALARIES_PRIOR,F9_01_PZ_TOT_ASSETS_BOY,F9_01_PZ_TOT_EXP_CURR,F9_01_PZ_TOT_LIAB_BOY,F9_01_PZ_TOT_REV_CURR,F9_03_PC_PGMSVC_SIGNIF_CHG,F9_03_
PC_PGMSVC_SIGNIF_NEW,F9_03_PC_PROG_SVC_ACC_1_CODE,F9_03_PC_PROG_SVC_ACC_1_DESC,F9_03_PC_PROG_SVC_ACC_1_EXP,F9_03_PC_PROG_SVC_ACC_1_GRNT,F9_03_PC_PROG_SVC_ACC_1_REV,F9_03_PC_PROG_SVC_ACC_2_CODE,F9_03_PC_PROG_SVC_ACC_2_DESC,F9_03_PC_PROG_SVC_ACC_2_EXP,F9_03_PC_PROG_SVC_ACC_2_GRNT,F9_03_PC_PROG_SVC_ACC_2_REV,F9_03_PC_PROG_SVC_ACC_3_CODE,F9_03_PC_PROG_SVC_ACC_3_DESC,F9_03_PC_PROG_SVC_ACC_3_EXP,F9_03_PC_PROG_SVC_ACC_3_GRNT,F9_03_PC_PROG_SVC_ACC_3_REV,F9_03_PC_TOT_OTH_PROG_SVC_EXP,F9_03_PC_TOT_OTH_PROG_SVC_GRNT,F9_03_PC_TOT_OTH_PROG_SVC_REV,F9_03_PC_TOT_PROG_SVC_EXPENSE,F9_03_PZ_MISSION_DESCRIPTION,F9_03_PZ_SCHEDULE_O_PART3,F9_04_PC_ACTVITIES_VIA_PARTNER,F9_04_PC_CONTROLLED_ENTITY,F9_04_PC_DISREGARDED_ENTITY,F9_04_PC_EXCESS_BENEFIT_TRANS,F9_04_PC_FR_EVENT_INC_GT_15K,F9_04_PC_GAMING_INC_GT_15K,F9_04_PC_LOBBYING_ACTIVITIES,F9_04_PC_POLITICAL_ACTIVITIES,F9_04_PC_PRIOR_EXCESS_BEN_TRAN,F9_04_PC_PROF_FR_EXP_GT_15K,F9_04_PC_RELATED_ENTITY,F9_04_PC_TRANS_TO_CNTRLD_ENT,F9_04_PC_TRANS_WITH_CNTRLD_ENT,F9_05_EXP_SCHED_O_X,F9_05_PC_NUMBER_EMPLOYEES_W3,F9_05_PC_NUMBER_FORMS_1096,F9_05_PC_UNRELATED_BUS_INCOME,F9_06_EXP_SCHED_O_X,F9_06_PC_990_PROVIDED_GOV_BODY,F9_06_PC_ANNUAL_DISC_COVRD_PERS,F9_06_PC_CEO_COMPENSTN_PROCESS,F9_06_PC_CHANGES_ORGANIZING_DOCS,F9_06_PC_CONFLICT_OF_INTEREST,F9_06_PC_DECISIONS_SUBJ_APPROVAL,F9_06_PC_DELEGATION_MGT_DUTIES,F9_06_PC_DELEGATION_OF_MGT,F9_06_PC_DOCUMENT_RET_POLICY,F9_06_PC_ELECTION_BOARD_MEMBERS,F9_06_PC_FAMILY_OR_BUSINESS_REL,F9_06_PC_FORM_AVAIL_OWN_WEBSITE,F9_06_PC_FORM_UPON_REQUEST,F9_06_PC_JOINT_VENTURE_INVESTMNT,F9_06_PC_JOINT_VENTURE_POLICY,F9_06_PC_LOCAL_CHAPTERS,F9_06_PC_MATERIAL_DIVERSION,F9_06_PC_MEMBERS_OR_STOCKHOLDERS,F9_06_PC_MINUTES_COMMITTEES,F9_06_PC_MINUTES_GOVERNING_BODY,F9_06_PC_MONITORING_OF_COI_POLICY,F9_06_PC_NUM_IND_VOTING_MEMBERS,F9_06_PC_NUM_VOTING_GOV_MEMBERS,F9_06_PC_OFFICER_MAILING_ADDRESS,F9_06_PC_OTHER_COMPENSTN_PROCESS,F9_06_PC_OTHER_WEBSITE,F9_06_PC_OWN_WEBSITE,F9_06_PC_POLICIES_GOVERN_CHAPTER,F9_06_PC_STATES_WHERE_RET
_FILED,F9_06_PC_WHISTLEBLOWER_POLICY,F9_07_EXP_SCHED_O_X,F9_07_PC_COMPENSATION_OTHER_SRCE,F9_07_PC_FORMER_OFFICER_LISTED,F9_07_PC_NO_LISTED_PERS_COMPENSD,F9_07_PC_NUM_CONTRCTRS_GRTR_100K,F9_07_PC_NUM_INDS_GREATER_100K,F9_07_PC_TOTAL_COMP_GRTR_150K,F9_07_PC_TOT_OTHER_COMPENSATION,F9_07_PC_TOT_REPRT_COMP_FROM_ORG,F9_07_PC_TOT_REPRT_COMP_RLTD_ORG,F9_08_EXP_SCHED_O_X,F9_08_PC_ALL_OTHER_CONTRIBUTIONS,F9_08_PC_CONTS_REPRTD_FNDRAISNG,F9_08_PC_COST_OF_GOODS_SOLD,F9_08_PC_FEDERATED_CAMPAIGNS,F9_08_PC_FUNDRAISING_DIRECT_EXP,F9_08_PC_FUNDRAISING_EVENTS,F9_08_PC_FUNDRAISING_GROSS_INC,F9_08_PC_GAMING_DIRECT_EXPENSES,F9_08_PC_GAMING_GROSS_INCOME,F9_08_PC_GOVERNMENT_GRANTS,F9_08_PC_GROSS_SALES_INVENTORY,F9_08_PC_MEMBERSHIP_DUES,F9_08_PC_NONCASH_CONTRIBUTIONS,F9_08_PC_PROGRAM_SVCE_REV_TOTAL,F9_08_PC_RELATED_ORGANIZATIONS,F9_08_PC_TOTAL_CONTRIBUTIONS,F9_08_PC_TOTAL_OTHER_REVENUE,F9_08_PC_TOTAL_PROG_SVCE_REVENUE,F9_08_PC_TOTAL_REVENUE,F9_09_EXP_AD_PROMO_TOT,F9_09_EXP_BENF_PAID_MEMB_TOT,F9_09_EXP_CONF_MEETING_TOT,F9_09_EXP_DEPREC_FUNDR,F9_09_EXP_DEPREC_MAG,F9_09_EXP_DEPREC_PROG,F9_09_EXP_DEPREC_TOT,F9_09_EXP_GRANT_FRGN_TOT,F9_09_EXP_GRANT_INDIV_DMSTC_TOT,F9_09_EXP_GRANT_ORG_DMSTC_TOT,F9_09_EXP_INFO_TECH_TOT,F9_09_EXP_INSURANCE_TOT,F9_09_EXP_INTEREST_TOT,F9_09_EXP_JOINT_COSTS_TOT,F9_09_EXP_OCCUPANCY_TOT,F9_09_EXP_OFFICE_TOT,F9_09_EXP_OTH_OTH_TOT,F9_09_EXP_ROY_TOT,F9_09_EXP_SCHED_O_X,F9_09_EXP_TRAVEL_ENTRTNMNT_TOT,F9_09_EXP_TRAVEL_TOT,F9_09_PC_COMP_DISQUAL_FUNDRAISE,F9_09_PC_COMP_DISQUAL_MGMT,F9_09_PC_COMP_DISQUAL_PROG_SVCE,F9_09_PC_COMP_DISQUAL_TOTAL,F9_09_PC_COMP_OFFICERS_FUNDRAISE,F9_09_PC_COMP_OFFICERS_MGMT,F9_09_PC_COMP_OFFICERS_PROG_SVCE,F9_09_PC_COMP_OFFICERS_TOTAL,F9_09_PC_FEES_FOR_SVCE_ACCT_TOT,F9_09_PC_FEES_FOR_SVCE_INVST_TOT,F9_09_PC_FEES_FOR_SVCE_LEGL_TOT,F9_09_PC_FEES_FOR_SVCE_LOBB_TOT,F9_09_PC_FEES_FOR_SVCE_MGMT_TOT,F9_09_PC_FEES_FOR_SVCE_OTH_TOT,F9_09_PC_OTHER_EMP_BEN_FUNDRAISE,F9_09_PC_OTHER_EMP_BEN_MGMT,F9_09_PC_OTHER_EMP_BEN_PROG_SVCE,F9_09_PC_OTHER_EMP_BEN_TOTAL,F9_09
_PC_OTHER_SALARY_FUNDRAISE,F9_09_PC_OTHER_SALARY_MGMT,F9_09_PC_OTHER_SALARY_PROG_SVCE,F9_09_PC_OTHER_SALARY_TOTAL,F9_09_PC_PAYMENT_TO_AFFILIATES,F9_09_PC_PAYROLL_TAX_FUNDRAISE,F9_09_PC_PAYROLL_TAX_MGMT,F9_09_PC_PAYROLL_TAX_PROG_SVCE,F9_09_PC_PAYROLL_TAX_TOTAL,F9_09_PC_PENSION_CONT_FUNDRAISE,F9_09_PC_PENSION_CONT_MGMT,F9_09_PC_PENSION_CONT_PROG_SVCE,F9_09_PC_PENSION_CONT_TOTAL,F9_09_PC_TOTAL_FUNC_EXPENSES,F9_09_PC_TOTAL_FUNDRAISE_EXPENSE,F9_09_PC_TOTAL_MGMT_EXPENSE,F9_09_PC_TOTAL_PROG_SVCE_EXPENSE,F9_10_ASSETS_ACC_NET_EOY,F9_10_ASSETS_EXP_PREPAID_EOY,F9_10_ASSETS_INTANGIB_EOY,F9_10_ASSETS_INVENT_SALE_EOY,F9_10_ASSETS_LESS_DEPREC_EOY,F9_10_ASSETS_LOANS_DISQUAL_EOY,F9_10_ASSETS_NOTES_LOANS_NET_EOY,F9_10_ASSETS_OTH_EOY,F9_10_ASSETS_PLEDGES_NET_EOY,F9_10_LIAB_ACC_PAYABLE_EOY,F9_10_LIAB_GRANTS_PAYABLE_EOY,F9_10_LIAB_LOANS_OFF_EOY,F9_10_LIAB_REV_DEFERRED_EOY,F9_10_NAFB_RESTRICT_PERM_EOY,F9_10_NAFB_RESTRICT_TEMP_EOY,F9_10_NAFB_UNRESTRICT_EOY,F9_10_PC_BOND_LIABILITY_EOY,F9_10_PC_CASH_NON_INTEREST_BOY,F9_10_PC_CASH_NON_INTEREST_EOY,F9_10_PC_ESCROW_LIABILITY_EOY,F9_10_PC_INVEST_OTHER_SEC_EOY,F9_10_PC_INVEST_PROG_RELTD_EOY,F9_10_PC_INVEST_PUB_TRADED_EOY,F9_10_PC_LAND_BLDG_EQPMT,F9_10_PC_LAND_BLDG_EQPMT_DEPRCTN,F9_10_PC_LOANS_FROM_OFFICERS_EOY,F9_10_PC_ORG_FOLLOWS_SFAS117,F9_10_PC_ORG_NOT_FOLLOW_SFAS117,F9_10_PC_OTHER_LIABILITIES_EOY,F9_10_PC_RET_EARNINGS_ENDWMT_EOY,F9_10_PC_SAVINGS_TEMP_INVEST_BOY,F9_10_PC_SAVINGS_TEMP_INVEST_EOY,F9_10_PC_SECURED_MORTGAGES_EOY,F9_10_PC_SECURE_MORT_NOTES_EOY,F9_10_PC_UNSECURED_LOANS_EOY,F9_10_PC_UNSECURED_NOTES_BOY,F9_10_PC_UNSECURED_NOTES_EOY,F9_10_PZ_TOTAL_ASSETS_EOY,F9_10_SCHED_O_X,F9_11_PC_RECNCLTN_DONATED_SVCES,F9_11_PC_RECNCLTN_INVSTMNT_EXP,F9_11_PC_RECNCLTN_PRIOR_PER_ADJ,F9_11_PC_RECNCLTN_REV_LESS_EXP,F9_11_PC_RECNCLTN_UNRLZD_GAIN,F9_11_SCHED_O_X,F9_12_PC_ACCNT_COMPILE_OR_REVIEW,F9_12_PC_ACCTG_METHOD_ACCRUAL,F9_12_PC_ACCTG_METHOD_CASH,F9_12_PC_ACCTG_METHOD_OTHER,F9_12_PC_AUDIT_COMMITTEE,F9_12_PC_FED_GRNT_AUDIT_PERFORMD,F9_12_PC_FED_GRNT_A
UDIT_REQUIRED,F9_12_PC_FINCL_STMTS_AUDITED,F9_12_SCHED_O_X,number_of_other_prog_svces,501c3,F9_00_HD_FILER_ADDR_US_L1,F9_00_HD_FILER_ADDR_US_L2,F9_00_HD_FILER_CITY_US,F9_00_HD_FILER_ZIP_US,F9_00_HD_FILER_COUNTRY_FRGN,F9_00_HD_FILER_STATE_US,F9_00_HD_TIME_STAMP_yr,ein_int,BMF_EIN2,BMF_EIN,BMF_NTEE_IRS,BMF_NTEE_NCCS,BMF_NTEEV2,BMF_NCCS_LEVEL_1,BMF_NCCS_LEVEL_2,BMF_NCCS_LEVEL_3,BMF_F990_TOTAL_REVENUE_RECENT,BMF_F990_TOTAL_INCOME_RECENT,BMF_F990_TOTAL_ASSETS_RECENT,BMF_F990_ORG_ADDR_CITY,BMF_F990_ORG_ADDR_STATE,BMF_F990_ORG_ADDR_ZIP,BMF_F990_ORG_ADDR_STREET,BMF_CENSUS_CBSA_FIPS,BMF_CENSUS_CBSA_NAME,BMF_CENSUS_BLOCK_FIPS,BMF_CENSUS_URBAN_AREA,BMF_CENSUS_STATE_ABBR,BMF_CENSUS_COUNTY_NAME,BMF_ORG_ADDR_FULL,BMF_ORG_ADDR_MATCH,BMF_LATITUDE,BMF_LONGITUDE,BMF_GEOCODER_SCORE,BMF_GEOCODER_MATCH,BMF_BMF_SUBSECTION_CODE,BMF_BMF_STATUS_CODE,BMF_BMF_PF_FILING_REQ_CODE,BMF_BMF_ORGANIZATION_CODE,BMF_BMF_INCOME_CODE,BMF_BMF_GROUP_EXEMPT_NUM,BMF_BMF_FOUNDATION_CODE,BMF_BMF_FILING_REQ_CODE,BMF_BMF_DEDUCTIBILITY_CODE,BMF_BMF_CLASSIFICATION_CODE,BMF_BMF_ASSET_CODE,BMF_BMF_AFFILIATION_CODE,BMF_ORG_RULING_DATE,BMF_ORG_FISCAL_YEAR,BMF_ORG_RULING_YEAR,BMF_ORG_YEAR_FIRST,BMF_ORG_YEAR_LAST,BMF_ORG_YEAR_COUNT,BMF_ORG_PERS_ICO,BMF_ORG_NAME_SEC,BMF_ORG_NAME_CURRENT,BMF_ORG_FISCAL_PERIOD
3016720,65c1a1d52a9ba8ce45342904,,https://s3.amazonaws.com/irs-form-990/202323189349305317_public.xml,,,0,2023-04-26 12:10:37+00:00,2022,10017496,,,,"{'AddressLine1': None, 'AddressLine1Txt': 'PO BOX 534', 'AddressLine2': None, 'AddressLine2Txt': None, 'City': None, 'CityNm': 'YORK HARBOR', 'State': None, 'StateAbbreviationCd': 'ME', 'ZIPCd': '03911', 'ZIPCode': None}",,,{'BusinessNameLine1Txt': 'AGAMENTICUS YACHT CLUB INC'},AGAM,2073638510,,,0,0,,0,,1,0,,376800,0,0,0,DANIEL FORD,2023-11-13,,ME,2022-01-01,2022-12-31,2022,2023-11-14 16:30:26+00:00,0,1,0,,0,WWW.AYCSAIL.ORG,1937,0,279970,0,0,13,0,273331,0,0,0,0,0,184620,0,0,413907,0,2744,16,20,8282,0,0,0,13,0,0,34628,405625,"THE ORGANIZATION'S PRIMARY EXEMPT PURPOSE IS TO TEACH SAILING TO CHILDREN BY FOCUSING ON SAFETY, ENJOYMENT AND KNOWLEDGE OF SAILING.",132172,3377,54843,56026,0,273331,188198,0,372818,0,0,,"PROVIDES SAILING INSTRUCTION, SEAMAN-SHIP AND WATER SAFETY SKILLS TO CHILDREN.",167950,0,54843,,,0,0,0,,,0,0,0,0,0,0,167950,"THE ORGANIZATION'S PRIMARY EXEMPT PURPOSE IS TO TEACH YOUNGSTERS THE BASICS OF SAILING, SEAMAN-SHIP AND SAFE CONDUCT ON THE WATER. 
IT IS THE ORGANIZATION'S MISSION TO CREATE AND SUSTAIN A COMMUNITY OF FAMILIES WHO ENJOY BEING ON THE WATER.",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,16,2,0,1,1,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,1,1,0,13,13,0,0,0,0,0,,0,0,0,0,1,0,0,0,0,0,0,0,236965,0,0,0,0,0,0,0,0,0,3377,43005,0,54843,0,279970,0,54843,372818,0,0,0,0,0,16519,16519,0,0,0,84,21088,0,0,2776,1000,28263,0,1,0,0,0,0,0,0,0,0,0,0,6748,0,0,0,0,0,0,0,0,0,0,0,52045,52045,0,0,0,3981,3981,0,0,0,0,188198,2744,17504,167950,0,0,30000,0,174902,0,0,0,0,682,0,0,7600,0,0,0,0,29306,125682,0,0,0,83323,453817,278915,0,0,0,0,0,0,0,0,0,0,0,0,413907,0,0,0,0,184620,-52326,0,0,1,0,,0,0,0,0,0,,1,PO BOX 534,,YORK HARBOR,3911,,ME,2023,10017496,EIN-01-0017496,10017496,N50,N50,HMS-N50-RG,501C3 CHARITY,O,HS,372818.0,376800.0,413907.0,YORK HARBOR,ME,03911-0534,PO BOX 534,38860.0,"Portland-South Portland, ME",230310400000000.0,U,ME,York County,"PO BOX 534,YORK HARBOR,ME,03911-0534","03911-0534, York Harbor, Maine",43.134174,-70.640878,98.0,M,3.0,1.0,0.0,1.0,4.0,0.0,15.0,1.0,1.0,2000.0,4.0,3.0,1993-03,2024.0,1993.0,1995.0,2024.0,30.0,,,AGAMENTICUS YACHT CLUB OF YORK,3.0


In [102]:
merged[(merged['501c3']==1)&(merged['BMF_BMF_SUBSECTION_CODE']!=3)][['501c3', 'BMF_BMF_SUBSECTION_CODE', 
                                                          'BMF_ORG_NAME_CURRENT']].sample(10)

Unnamed: 0,501c3,BMF_BMF_SUBSECTION_CODE,BMF_ORG_NAME_CURRENT
3468653,1,19.0,AMERICAN LEGION
3147445,1,,
3161497,1,,
3020710,1,,
3383186,1,,
3132359,1,,
3146547,1,,
3386037,1,,
3104027,1,6.0,ST LANDRY CHAMNBER OF COMMERCE
3365434,1,,


<br>For 2016 the organization checked '501(c)(3)' in its 990. In 2017 it instead filled in 501(c)(8) -- perhaps it changed types?

A 501(c)(8) organization is a "fraternal beneficiary society, order, or association."

On the IRS website the organization's full name is "Independent Order Of Odd Fellows Gill Terrace Retirement Apartments Ii" so the 501(c)(8) designation fits: https://apps.irs.gov/app/eos/detailsPage?ein=200310256&name=INDEPENDENT%20ORDER%20OF%20ODD%20FELLOWS%20GILL%20TERRACE%20RETIREMENT%20APARTMENTS%20II&city=&state=&countryAbbr=US&dba=&type=COPYOFRETURNS&orgTags=COPYOFRETURNS 

IRS EIN search link: https://apps.irs.gov/app/eos/allSearch

In [104]:
merged[merged['EIN']=='200310256'][['F9_00_HD_TAX_YEAR', '501c3', 'BMF_BMF_SUBSECTION_CODE', 'BMF_ORG_NAME_CURRENT']]

Unnamed: 0,F9_00_HD_TAX_YEAR,501c3,BMF_BMF_SUBSECTION_CODE,BMF_ORG_NAME_CURRENT
596123,2013,1,8.0,INDEPENDENT ORDER OF ODD FELLOWS
747326,2014,1,8.0,INDEPENDENT ORDER OF ODD FELLOWS
882846,2015,1,8.0,INDEPENDENT ORDER OF ODD FELLOWS
1096187,2016,1,8.0,INDEPENDENT ORDER OF ODD FELLOWS
1414695,2017,0,8.0,INDEPENDENT ORDER OF ODD FELLOWS
1648495,2018,0,8.0,INDEPENDENT ORDER OF ODD FELLOWS
1936753,2019,0,8.0,INDEPENDENT ORDER OF ODD FELLOWS
2125132,2020,0,8.0,INDEPENDENT ORDER OF ODD FELLOWS
2315651,2021,0,8.0,INDEPENDENT ORDER OF ODD FELLOWS
2942302,2022,0,8.0,INDEPENDENT ORDER OF ODD FELLOWS
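
The year-by-year listing above suggests a general check: for each EIN, find the tax year in which the reported *501c3* flag differs from the previous filing. What follows is a minimal sketch on invented toy data, not the notebook's own method; the `toy` frame, its values, and the `flag_changed` column are assumptions for illustration, and only the column names `EIN`, `F9_00_HD_TAX_YEAR`, and `501c3` come from `merged`.

```python
import pandas as pd

# Toy stand-in for `merged`; the values are invented for illustration only.
toy = pd.DataFrame({
    'EIN': ['A', 'A', 'A', 'A', 'B', 'B', 'B'],
    'F9_00_HD_TAX_YEAR': [2015, 2016, 2017, 2018, 2016, 2017, 2018],
    '501c3': [1, 1, 0, 0, 1, 1, 1],
})

# Put each EIN's filings in chronological order.
toy = toy.sort_values(['EIN', 'F9_00_HD_TAX_YEAR'])

# Flag reported on the previous filing by the same EIN (NaN for the first filing).
prev_flag = toy.groupby('EIN')['501c3'].shift()

# A switch is a filing whose flag differs from the previous filing's flag.
toy['flag_changed'] = prev_flag.notna() & (toy['501c3'] != prev_flag)

# Tax years in which the reported designation switched.
switches = toy.loc[toy['flag_changed'], ['EIN', 'F9_00_HD_TAX_YEAR', '501c3']]
print(switches)
```

On the real data the same pattern would presumably be applied to `merged` directly; a switch in the e-file flag is only a prompt to look the organization up, since it may reflect either a genuine change in designation or a data-entry error.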


<br>There are 5,740 filings -- and 1,324 unique EINs -- where the e-file-based variable *501c3* is coded as not being a 501(c)(3) but the BMF variable has it coded as a 501(c)(3) (subsection code 3).

In [105]:
print(len(merged[(merged['501c3']!=1)&(merged['BMF_BMF_SUBSECTION_CODE']==3)]))
print(len(set(merged[(merged['501c3']!=1)&(merged['BMF_BMF_SUBSECTION_CODE']==3)]['EIN'].tolist())))

5740
1324
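
Both directions of disagreement, plus the filings with no BMF match, can be tabulated in one pass. Note that a filter such as `!= 3` also matches rows where the BMF code is missing, because `NaN != 3` evaluates to `True` in pandas. The sketch below uses invented toy data; the `toy` frame and the `bmf_state` labels are assumptions for illustration, not part of the notebook.

```python
import numpy as np
import pandas as pd

# Toy stand-in for `merged`; the values are invented for illustration only.
toy = pd.DataFrame({
    '501c3': [1, 1, 1, 1, 0, 0, 0],
    'BMF_BMF_SUBSECTION_CODE': [3.0, 3.0, 19.0, np.nan, 3.0, 6.0, np.nan],
})

def classify(code):
    """Collapse a BMF subsection code into three states."""
    if pd.isna(code):
        return 'missing'
    return '501c3' if code == 3 else 'other'

# Keep missing BMF codes as their own category instead of folding them into "not 3".
bmf_state = toy['BMF_BMF_SUBSECTION_CODE'].map(classify).rename('bmf_state')

# Rows: e-file flag; columns: BMF state.
table = pd.crosstab(toy['501c3'], bmf_state)
print(table)
```

The off-diagonal cells are the two kinds of conflict, and the `missing` column separates true disagreements from filings that simply have no BMF entry.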


In [106]:
merged[(merged['501c3']!=1)&(merged['BMF_BMF_SUBSECTION_CODE']==3)][['501c3', 'BMF_BMF_SUBSECTION_CODE',
                                                                     'BMF_ORG_NAME_CURRENT']].sample(10)

Unnamed: 0,501c3,BMF_BMF_SUBSECTION_CODE,BMF_ORG_NAME_CURRENT
338539,0,3.0,GERMANIA SINGING SOCIETY
1252833,0,3.0,BURLINGTON VOLUNTEER FIRE DEPT INC
1664908,0,3.0,CHRISTOPHER CLUB INC
270744,0,3.0,FAIRVIEW VOLUNTEER FIRE DEPARTMENT
1027169,0,3.0,CHESTERFIELD FIRE CO INC
552043,0,3.0,LEWISVILLE FIRE DEPARTMENT INC
550688,0,3.0,NEW FRANKLIN FIRE DEPARTMENT
1694123,0,3.0,BULLOCH COUNTY ALCOHOL AND DRUG ABUSE COUNCIL INC
836689,0,3.0,LOIS VOLUNTEER FIRE DEPARTMENT & RESCUE SQUAD INC
893660,0,3.0,ABINGTON FIRE COMPANY 255501


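<br>Inspect one of the above organizations. This one checked the '501(c)(3)' box for 2017 and 2016 (the two latest years on the IRS site) but not for 2015 and 2014. In the merged data below, the 2010-2014 filings are coded 0 and the 2020-2023 filings are coded 1.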

The IRS determination letter from 1963 says they qualify for 501(c)(3) status: https://apps.irs.gov/app/eos/detailsPage?ein=911891203&name=District%20of%20Columbia%20Eastern%20Star%20Temple%20Building%20and%20Maintenance&city=Washington&state=DC&countryAbbr=US&dba=&type=CHARITIES,%20DETERMINATIONLETTERS,%20COPYOFRETURNS&orgTags=CHARITIES&orgTags=DETERMINATIONLETTERS&orgTags=COPYOFRETURNS

In [107]:
merged[merged['EIN']=='911891203'][['F9_00_HD_TAX_YEAR', '501c3', 'BMF_BMF_SUBSECTION_CODE', 'BMF_ORG_NAME_CURRENT']]

Unnamed: 0,F9_00_HD_TAX_YEAR,501c3,BMF_BMF_SUBSECTION_CODE,BMF_ORG_NAME_CURRENT
544459,2013,0,3.0,DISTRICT OF COLUMBIA EASTERN STAR TEMPLE BUILDING AND MAINTENANCE
724624,2014,0,3.0,DISTRICT OF COLUMBIA EASTERN STAR TEMPLE BUILDING AND MAINTENANCE
2780529,2020,1,3.0,DISTRICT OF COLUMBIA EASTERN STAR TEMPLE BUILDING AND MAINTENANCE
2790274,2020,1,3.0,DISTRICT OF COLUMBIA EASTERN STAR TEMPLE BUILDING AND MAINTENANCE
3078083,2021,1,3.0,DISTRICT OF COLUMBIA EASTERN STAR TEMPLE BUILDING AND MAINTENANCE
3228982,2022,1,3.0,DISTRICT OF COLUMBIA EASTERN STAR TEMPLE BUILDING AND MAINTENANCE
3156299,2023,1,3.0,DISTRICT OF COLUMBIA EASTERN STAR TEMPLE BUILDING AND MAINTENANCE
51546,2010,0,3.0,DISTRICT OF COLUMBIA EASTERN STAR TEMPLE BUILDING AND MAINTENANCE
184232,2011,0,3.0,DISTRICT OF COLUMBIA EASTERN STAR TEMPLE BUILDING AND MAINTENANCE
339823,2012,0,3.0,DISTRICT OF COLUMBIA EASTERN STAR TEMPLE BUILDING AND MAINTENANCE
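
The listing above is in index order rather than year order, and tax year 2020 appears twice. A minimal sketch of how such a listing might be sorted and checked for repeated EIN-year filings, using invented toy data (the `toy` frame and the names `ordered` and `repeat_years` are assumptions for illustration):

```python
import pandas as pd

# Toy stand-in for one organization's rows in `merged`; values are invented.
toy = pd.DataFrame({
    'EIN': ['X'] * 5,
    'F9_00_HD_TAX_YEAR': [2013, 2020, 2020, 2021, 2010],
    '501c3': [0, 1, 1, 1, 0],
})

# Chronological order makes any change in the flag easier to read.
ordered = toy.sort_values(['EIN', 'F9_00_HD_TAX_YEAR'])

# keep=False marks every row of a repeated EIN-year pair, not just the later ones.
repeat_years = ordered[ordered.duplicated(['EIN', 'F9_00_HD_TAX_YEAR'], keep=False)]
print(repeat_years)
```

Repeated EIN-year pairs may be amended returns or short-period filings; that would need to be checked against the filing itself.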


In [108]:
print(merged['BMF_BMF_SUBSECTION_CODE'].value_counts(), '\n')
print(merged['501c3'].value_counts())

BMF_BMF_SUBSECTION_CODE
3.0     2640582
6.0      226688
4.0      115446
5.0      104216
7.0       96271
9.0       54704
8.0       42043
19.0      40471
12.0      32827
14.0      27154
2.0       21376
13.0      20215
10.0      12494
25.0       4473
91.0       1937
15.0       1347
17.0        620
1.0         541
92.0        297
29.0        116
16.0         96
26.0         91
11.0         68
50.0         40
27.0         36
23.0         28
71.0         14
18.0         13
20.0         12
21.0          4
0.0           1
Name: count, dtype: int64 

501c3
1    2639594
0     832177
Name: count, dtype: int64
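
One caveat when reading the two tallies above: `value_counts()` drops missing values by default, so filings without a BMF match do not appear in the subsection-code counts and the two totals need not agree. A small sketch on invented data (the `codes` series is an assumption for illustration) shows the difference:

```python
import numpy as np
import pandas as pd

# Toy subsection codes with two missing entries; values are invented.
codes = pd.Series([3.0, 3.0, 6.0, np.nan, np.nan], name='BMF_BMF_SUBSECTION_CODE')

# Default behaviour: NaN rows are silently left out of the tally.
default_counts = codes.value_counts()

# dropna=False adds a NaN row so the tally sums to the number of filings.
full_counts = codes.value_counts(dropna=False)

# Direct count of missing codes.
n_missing = codes.isna().sum()
print(full_counts)
print('missing:', n_missing)
```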


In [109]:
merged[:1]

Unnamed: 0,_id,OrganizationName,URL,DLN,TaxPeriod,F9_09_PC_FEES_FOR_SVCE_FR_TOT,F9_00_HD_BUILD_TIME_STAMP,fiscal_year,EIN,Name,NameControl,Phone,USAddress,ForeignAddress,InCareOfName,BusinessName,BusinessNameControlTxt,PhoneNum,InCareOfNm,ForeignPhoneNum,F9_00_HD_ADDR_CHANGE,F9_00_HD_AMENDED_RETURN,F9_00_HD_CTRY_OF_DOMICILE,F9_00_HD_EXEMPT_STATUS_4847A1,F9_00_HD_EXEMPT_STATUS_501C,F9_00_HD_EXEMPT_STATUS_501C3,F9_00_HD_FINAL_RETURN,F9_00_HD_GROSS_EXEMPT_NUM,F9_00_HD_GROSS_RCPT,F9_00_HD_GROUP_RETURN,F9_00_HD_INCLUDES_SUBORD_ORGS,F9_00_HD_INITIAL_RETURN,F9_00_HD_PRIN_OFF_NAME,F9_00_HD_SIGNING_OFFICER_SIGNTR,F9_00_HD_SPECIAL_CONDITION_DESC,F9_00_HD_STATE_OF_DOMICILE,F9_00_HD_TAX_PER_BEGIN,F9_00_HD_TAX_PER_END,F9_00_HD_TAX_YEAR,F9_00_HD_TIME_STAMP,F9_00_HD_TYPE_ORG_ASSOCIATION,F9_00_HD_TYPE_ORG_CORP,F9_00_HD_TYPE_ORG_OTHER,F9_00_HD_TYPE_ORG_OTHER_DESC,F9_00_HD_TYPE_ORG_TRUST,F9_00_HD_WEBSITE,F9_00_HD_YEAR_FORMED,F9_01_PC_BEN_PAID_MEMB_PRIOR,F9_01_PC_CONTR_GRANTS_CURR,F9_01_PC_CONTR_GRANTS_PRIOR,F9_01_PC_GRANTS_PRIOR,F9_01_PC_INDEP_VOTING_MEMB,F9_01_PC_INVEST_INCOME_PRIOR,F9_01_PC_NET_ASSETS_BOY,F9_01_PC_OTHER_EXPENSE_PRIOR,F9_01_PC_OTHER_REV_PRIOR,F9_01_PC_PROF_FUNDRISING_EXP_CURR,F9_01_PC_PROF_FUNDRISING_EXP_PRIOR,F9_01_PC_PROG_SERVICE_REV_PRIOR,F9_01_PC_REV_LESS_EXP_CURR,F9_01_PC_REV_LESS_EXP_PRIOR,F9_01_PC_TERMINATION_CONTRACTION,F9_01_PC_TOT_ASSETS_EOY,F9_01_PC_TOT_EXP_PRIOR,F9_01_PC_TOT_FNDR_EXP_CURR,F9_01_PC_TOT_INDIV_EMPLOYED,F9_01_PC_TOT_INDIV_VOLUNTEERS,F9_01_PC_TOT_LIABILITIES_EOY,F9_01_PC_TOT_REVENUE_PRIOR,F9_01_PC_TOT_UBI_GROSS,F9_01_PC_TOT_UBI_NET,F9_01_PC_VOTING_MEMB_GOV_BODY,F9_01_PZ_BEN_PAID_TO_MEMB_CURR,F9_01_PZ_GRANTS_PAID_CURR,F9_01_PZ_INVEST_INCOME_CURR,F9_01_PZ_NAFB_EOY,F9_01_PZ_ORGANIZATIONAL_MISSION,F9_01_PZ_OTHER_EXPENSE_CURR,F9_01_PZ_OTHER_REV_CURR,F9_01_PZ_PROG_SERVICE_REV_CURR,F9_01_PZ_SALARIES_CURR,F9_01_PZ_SALARIES_PRIOR,F9_01_PZ_TOT_ASSETS_BOY,F9_01_PZ_TOT_EXP_CURR,F9_01_PZ_TOT_LIAB_BOY,F9_01_PZ_TOT_REV_CURR,F9_03_PC_PGMSVC_SIGNIF_CHG,F9_03_
PC_PGMSVC_SIGNIF_NEW,F9_03_PC_PROG_SVC_ACC_1_CODE,F9_03_PC_PROG_SVC_ACC_1_DESC,F9_03_PC_PROG_SVC_ACC_1_EXP,F9_03_PC_PROG_SVC_ACC_1_GRNT,F9_03_PC_PROG_SVC_ACC_1_REV,F9_03_PC_PROG_SVC_ACC_2_CODE,F9_03_PC_PROG_SVC_ACC_2_DESC,F9_03_PC_PROG_SVC_ACC_2_EXP,F9_03_PC_PROG_SVC_ACC_2_GRNT,F9_03_PC_PROG_SVC_ACC_2_REV,F9_03_PC_PROG_SVC_ACC_3_CODE,F9_03_PC_PROG_SVC_ACC_3_DESC,F9_03_PC_PROG_SVC_ACC_3_EXP,F9_03_PC_PROG_SVC_ACC_3_GRNT,F9_03_PC_PROG_SVC_ACC_3_REV,F9_03_PC_TOT_OTH_PROG_SVC_EXP,F9_03_PC_TOT_OTH_PROG_SVC_GRNT,F9_03_PC_TOT_OTH_PROG_SVC_REV,F9_03_PC_TOT_PROG_SVC_EXPENSE,F9_03_PZ_MISSION_DESCRIPTION,F9_03_PZ_SCHEDULE_O_PART3,F9_04_PC_ACTVITIES_VIA_PARTNER,F9_04_PC_CONTROLLED_ENTITY,F9_04_PC_DISREGARDED_ENTITY,F9_04_PC_EXCESS_BENEFIT_TRANS,F9_04_PC_FR_EVENT_INC_GT_15K,F9_04_PC_GAMING_INC_GT_15K,F9_04_PC_LOBBYING_ACTIVITIES,F9_04_PC_POLITICAL_ACTIVITIES,F9_04_PC_PRIOR_EXCESS_BEN_TRAN,F9_04_PC_PROF_FR_EXP_GT_15K,F9_04_PC_RELATED_ENTITY,F9_04_PC_TRANS_TO_CNTRLD_ENT,F9_04_PC_TRANS_WITH_CNTRLD_ENT,F9_05_EXP_SCHED_O_X,F9_05_PC_NUMBER_EMPLOYEES_W3,F9_05_PC_NUMBER_FORMS_1096,F9_05_PC_UNRELATED_BUS_INCOME,F9_06_EXP_SCHED_O_X,F9_06_PC_990_PROVIDED_GOV_BODY,F9_06_PC_ANNUAL_DISC_COVRD_PERS,F9_06_PC_CEO_COMPENSTN_PROCESS,F9_06_PC_CHANGES_ORGANIZING_DOCS,F9_06_PC_CONFLICT_OF_INTEREST,F9_06_PC_DECISIONS_SUBJ_APPROVAL,F9_06_PC_DELEGATION_MGT_DUTIES,F9_06_PC_DELEGATION_OF_MGT,F9_06_PC_DOCUMENT_RET_POLICY,F9_06_PC_ELECTION_BOARD_MEMBERS,F9_06_PC_FAMILY_OR_BUSINESS_REL,F9_06_PC_FORM_AVAIL_OWN_WEBSITE,F9_06_PC_FORM_UPON_REQUEST,F9_06_PC_JOINT_VENTURE_INVESTMNT,F9_06_PC_JOINT_VENTURE_POLICY,F9_06_PC_LOCAL_CHAPTERS,F9_06_PC_MATERIAL_DIVERSION,F9_06_PC_MEMBERS_OR_STOCKHOLDERS,F9_06_PC_MINUTES_COMMITTEES,F9_06_PC_MINUTES_GOVERNING_BODY,F9_06_PC_MONITORING_OF_COI_POLICY,F9_06_PC_NUM_IND_VOTING_MEMBERS,F9_06_PC_NUM_VOTING_GOV_MEMBERS,F9_06_PC_OFFICER_MAILING_ADDRESS,F9_06_PC_OTHER_COMPENSTN_PROCESS,F9_06_PC_OTHER_WEBSITE,F9_06_PC_OWN_WEBSITE,F9_06_PC_POLICIES_GOVERN_CHAPTER,F9_06_PC_STATES_WHERE_RET
_FILED,F9_06_PC_WHISTLEBLOWER_POLICY,F9_07_EXP_SCHED_O_X,F9_07_PC_COMPENSATION_OTHER_SRCE,F9_07_PC_FORMER_OFFICER_LISTED,F9_07_PC_NO_LISTED_PERS_COMPENSD,F9_07_PC_NUM_CONTRCTRS_GRTR_100K,F9_07_PC_NUM_INDS_GREATER_100K,F9_07_PC_TOTAL_COMP_GRTR_150K,F9_07_PC_TOT_OTHER_COMPENSATION,F9_07_PC_TOT_REPRT_COMP_FROM_ORG,F9_07_PC_TOT_REPRT_COMP_RLTD_ORG,F9_08_EXP_SCHED_O_X,F9_08_PC_ALL_OTHER_CONTRIBUTIONS,F9_08_PC_CONTS_REPRTD_FNDRAISNG,F9_08_PC_COST_OF_GOODS_SOLD,F9_08_PC_FEDERATED_CAMPAIGNS,F9_08_PC_FUNDRAISING_DIRECT_EXP,F9_08_PC_FUNDRAISING_EVENTS,F9_08_PC_FUNDRAISING_GROSS_INC,F9_08_PC_GAMING_DIRECT_EXPENSES,F9_08_PC_GAMING_GROSS_INCOME,F9_08_PC_GOVERNMENT_GRANTS,F9_08_PC_GROSS_SALES_INVENTORY,F9_08_PC_MEMBERSHIP_DUES,F9_08_PC_NONCASH_CONTRIBUTIONS,F9_08_PC_PROGRAM_SVCE_REV_TOTAL,F9_08_PC_RELATED_ORGANIZATIONS,F9_08_PC_TOTAL_CONTRIBUTIONS,F9_08_PC_TOTAL_OTHER_REVENUE,F9_08_PC_TOTAL_PROG_SVCE_REVENUE,F9_08_PC_TOTAL_REVENUE,F9_09_EXP_AD_PROMO_TOT,F9_09_EXP_BENF_PAID_MEMB_TOT,F9_09_EXP_CONF_MEETING_TOT,F9_09_EXP_DEPREC_FUNDR,F9_09_EXP_DEPREC_MAG,F9_09_EXP_DEPREC_PROG,F9_09_EXP_DEPREC_TOT,F9_09_EXP_GRANT_FRGN_TOT,F9_09_EXP_GRANT_INDIV_DMSTC_TOT,F9_09_EXP_GRANT_ORG_DMSTC_TOT,F9_09_EXP_INFO_TECH_TOT,F9_09_EXP_INSURANCE_TOT,F9_09_EXP_INTEREST_TOT,F9_09_EXP_JOINT_COSTS_TOT,F9_09_EXP_OCCUPANCY_TOT,F9_09_EXP_OFFICE_TOT,F9_09_EXP_OTH_OTH_TOT,F9_09_EXP_ROY_TOT,F9_09_EXP_SCHED_O_X,F9_09_EXP_TRAVEL_ENTRTNMNT_TOT,F9_09_EXP_TRAVEL_TOT,F9_09_PC_COMP_DISQUAL_FUNDRAISE,F9_09_PC_COMP_DISQUAL_MGMT,F9_09_PC_COMP_DISQUAL_PROG_SVCE,F9_09_PC_COMP_DISQUAL_TOTAL,F9_09_PC_COMP_OFFICERS_FUNDRAISE,F9_09_PC_COMP_OFFICERS_MGMT,F9_09_PC_COMP_OFFICERS_PROG_SVCE,F9_09_PC_COMP_OFFICERS_TOTAL,F9_09_PC_FEES_FOR_SVCE_ACCT_TOT,F9_09_PC_FEES_FOR_SVCE_INVST_TOT,F9_09_PC_FEES_FOR_SVCE_LEGL_TOT,F9_09_PC_FEES_FOR_SVCE_LOBB_TOT,F9_09_PC_FEES_FOR_SVCE_MGMT_TOT,F9_09_PC_FEES_FOR_SVCE_OTH_TOT,F9_09_PC_OTHER_EMP_BEN_FUNDRAISE,F9_09_PC_OTHER_EMP_BEN_MGMT,F9_09_PC_OTHER_EMP_BEN_PROG_SVCE,F9_09_PC_OTHER_EMP_BEN_TOTAL,F9_09
_PC_OTHER_SALARY_FUNDRAISE,F9_09_PC_OTHER_SALARY_MGMT,F9_09_PC_OTHER_SALARY_PROG_SVCE,F9_09_PC_OTHER_SALARY_TOTAL,F9_09_PC_PAYMENT_TO_AFFILIATES,F9_09_PC_PAYROLL_TAX_FUNDRAISE,F9_09_PC_PAYROLL_TAX_MGMT,F9_09_PC_PAYROLL_TAX_PROG_SVCE,F9_09_PC_PAYROLL_TAX_TOTAL,F9_09_PC_PENSION_CONT_FUNDRAISE,F9_09_PC_PENSION_CONT_MGMT,F9_09_PC_PENSION_CONT_PROG_SVCE,F9_09_PC_PENSION_CONT_TOTAL,F9_09_PC_TOTAL_FUNC_EXPENSES,F9_09_PC_TOTAL_FUNDRAISE_EXPENSE,F9_09_PC_TOTAL_MGMT_EXPENSE,F9_09_PC_TOTAL_PROG_SVCE_EXPENSE,F9_10_ASSETS_ACC_NET_EOY,F9_10_ASSETS_EXP_PREPAID_EOY,F9_10_ASSETS_INTANGIB_EOY,F9_10_ASSETS_INVENT_SALE_EOY,F9_10_ASSETS_LESS_DEPREC_EOY,F9_10_ASSETS_LOANS_DISQUAL_EOY,F9_10_ASSETS_NOTES_LOANS_NET_EOY,F9_10_ASSETS_OTH_EOY,F9_10_ASSETS_PLEDGES_NET_EOY,F9_10_LIAB_ACC_PAYABLE_EOY,F9_10_LIAB_GRANTS_PAYABLE_EOY,F9_10_LIAB_LOANS_OFF_EOY,F9_10_LIAB_REV_DEFERRED_EOY,F9_10_NAFB_RESTRICT_PERM_EOY,F9_10_NAFB_RESTRICT_TEMP_EOY,F9_10_NAFB_UNRESTRICT_EOY,F9_10_PC_BOND_LIABILITY_EOY,F9_10_PC_CASH_NON_INTEREST_BOY,F9_10_PC_CASH_NON_INTEREST_EOY,F9_10_PC_ESCROW_LIABILITY_EOY,F9_10_PC_INVEST_OTHER_SEC_EOY,F9_10_PC_INVEST_PROG_RELTD_EOY,F9_10_PC_INVEST_PUB_TRADED_EOY,F9_10_PC_LAND_BLDG_EQPMT,F9_10_PC_LAND_BLDG_EQPMT_DEPRCTN,F9_10_PC_LOANS_FROM_OFFICERS_EOY,F9_10_PC_ORG_FOLLOWS_SFAS117,F9_10_PC_ORG_NOT_FOLLOW_SFAS117,F9_10_PC_OTHER_LIABILITIES_EOY,F9_10_PC_RET_EARNINGS_ENDWMT_EOY,F9_10_PC_SAVINGS_TEMP_INVEST_BOY,F9_10_PC_SAVINGS_TEMP_INVEST_EOY,F9_10_PC_SECURED_MORTGAGES_EOY,F9_10_PC_SECURE_MORT_NOTES_EOY,F9_10_PC_UNSECURED_LOANS_EOY,F9_10_PC_UNSECURED_NOTES_BOY,F9_10_PC_UNSECURED_NOTES_EOY,F9_10_PZ_TOTAL_ASSETS_EOY,F9_10_SCHED_O_X,F9_11_PC_RECNCLTN_DONATED_SVCES,F9_11_PC_RECNCLTN_INVSTMNT_EXP,F9_11_PC_RECNCLTN_PRIOR_PER_ADJ,F9_11_PC_RECNCLTN_REV_LESS_EXP,F9_11_PC_RECNCLTN_UNRLZD_GAIN,F9_11_SCHED_O_X,F9_12_PC_ACCNT_COMPILE_OR_REVIEW,F9_12_PC_ACCTG_METHOD_ACCRUAL,F9_12_PC_ACCTG_METHOD_CASH,F9_12_PC_ACCTG_METHOD_OTHER,F9_12_PC_AUDIT_COMMITTEE,F9_12_PC_FED_GRNT_AUDIT_PERFORMD,F9_12_PC_FED_GRNT_A
UDIT_REQUIRED,F9_12_PC_FINCL_STMTS_AUDITED,F9_12_SCHED_O_X,number_of_other_prog_svces,501c3,F9_00_HD_FILER_ADDR_US_L1,F9_00_HD_FILER_ADDR_US_L2,F9_00_HD_FILER_CITY_US,F9_00_HD_FILER_ZIP_US,F9_00_HD_FILER_COUNTRY_FRGN,F9_00_HD_FILER_STATE_US,F9_00_HD_TIME_STAMP_yr,ein_int,BMF_EIN2,BMF_EIN,BMF_NTEE_IRS,BMF_NTEE_NCCS,BMF_NTEEV2,BMF_NCCS_LEVEL_1,BMF_NCCS_LEVEL_2,BMF_NCCS_LEVEL_3,BMF_F990_TOTAL_REVENUE_RECENT,BMF_F990_TOTAL_INCOME_RECENT,BMF_F990_TOTAL_ASSETS_RECENT,BMF_F990_ORG_ADDR_CITY,BMF_F990_ORG_ADDR_STATE,BMF_F990_ORG_ADDR_ZIP,BMF_F990_ORG_ADDR_STREET,BMF_CENSUS_CBSA_FIPS,BMF_CENSUS_CBSA_NAME,BMF_CENSUS_BLOCK_FIPS,BMF_CENSUS_URBAN_AREA,BMF_CENSUS_STATE_ABBR,BMF_CENSUS_COUNTY_NAME,BMF_ORG_ADDR_FULL,BMF_ORG_ADDR_MATCH,BMF_LATITUDE,BMF_LONGITUDE,BMF_GEOCODER_SCORE,BMF_GEOCODER_MATCH,BMF_BMF_SUBSECTION_CODE,BMF_BMF_STATUS_CODE,BMF_BMF_PF_FILING_REQ_CODE,BMF_BMF_ORGANIZATION_CODE,BMF_BMF_INCOME_CODE,BMF_BMF_GROUP_EXEMPT_NUM,BMF_BMF_FOUNDATION_CODE,BMF_BMF_FILING_REQ_CODE,BMF_BMF_DEDUCTIBILITY_CODE,BMF_BMF_CLASSIFICATION_CODE,BMF_BMF_ASSET_CODE,BMF_BMF_AFFILIATION_CODE,BMF_ORG_RULING_DATE,BMF_ORG_FISCAL_YEAR,BMF_ORG_RULING_YEAR,BMF_ORG_YEAR_FIRST,BMF_ORG_YEAR_LAST,BMF_ORG_YEAR_COUNT,BMF_ORG_PERS_ICO,BMF_ORG_NAME_SEC,BMF_ORG_NAME_CURRENT,BMF_ORG_FISCAL_PERIOD
3016720,65c1a1d52a9ba8ce45342904,,https://s3.amazonaws.com/irs-form-990/202323189349305317_public.xml,,,0,2023-04-26 12:10:37+00:00,2022,10017496,,,,"{'AddressLine1': None, 'AddressLine1Txt': 'PO BOX 534', 'AddressLine2': None, 'AddressLine2Txt': None, 'City': None, 'CityNm': 'YORK HARBOR', 'State': None, 'StateAbbreviationCd': 'ME', 'ZIPCd': '03911', 'ZIPCode': None}",,,{'BusinessNameLine1Txt': 'AGAMENTICUS YACHT CLUB INC'},AGAM,2073638510,,,0,0,,0,,1,0,,376800,0,0,0,DANIEL FORD,2023-11-13,,ME,2022-01-01,2022-12-31,2022,2023-11-14 16:30:26+00:00,0,1,0,,0,WWW.AYCSAIL.ORG,1937,0,279970,0,0,13,0,273331,0,0,0,0,0,184620,0,0,413907,0,2744,16,20,8282,0,0,0,13,0,0,34628,405625,"THE ORGANIZATION'S PRIMARY EXEMPT PURPOSE IS TO TEACH SAILING TO CHILDREN BY FOCUSING ON SAFETY, ENJOYMENT AND KNOWLEDGE OF SAILING.",132172,3377,54843,56026,0,273331,188198,0,372818,0,0,,"PROVIDES SAILING INSTRUCTION, SEAMAN-SHIP AND WATER SAFETY SKILLS TO CHILDREN.",167950,0,54843,,,0,0,0,,,0,0,0,0,0,0,167950,"THE ORGANIZATION'S PRIMARY EXEMPT PURPOSE IS TO TEACH YOUNGSTERS THE BASICS OF SAILING, SEAMAN-SHIP AND SAFE CONDUCT ON THE WATER. 
IT IS THE ORGANIZATION'S MISSION TO CREATE AND SUSTAIN A COMMUNITY OF FAMILIES WHO ENJOY BEING ON THE WATER.",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,16,2,0,1,1,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,1,1,0,13,13,0,0,0,0,0,,0,0,0,0,1,0,0,0,0,0,0,0,236965,0,0,0,0,0,0,0,0,0,3377,43005,0,54843,0,279970,0,54843,372818,0,0,0,0,0,16519,16519,0,0,0,84,21088,0,0,2776,1000,28263,0,1,0,0,0,0,0,0,0,0,0,0,6748,0,0,0,0,0,0,0,0,0,0,0,52045,52045,0,0,0,3981,3981,0,0,0,0,188198,2744,17504,167950,0,0,30000,0,174902,0,0,0,0,682,0,0,7600,0,0,0,0,29306,125682,0,0,0,83323,453817,278915,0,0,0,0,0,0,0,0,0,0,0,0,413907,0,0,0,0,184620,-52326,0,0,1,0,,0,0,0,0,0,,1,PO BOX 534,,YORK HARBOR,3911,,ME,2023,10017496,EIN-01-0017496,10017496,N50,N50,HMS-N50-RG,501C3 CHARITY,O,HS,372818.0,376800.0,413907.0,YORK HARBOR,ME,03911-0534,PO BOX 534,38860.0,"Portland-South Portland, ME",230310400000000.0,U,ME,York County,"PO BOX 534,YORK HARBOR,ME,03911-0534","03911-0534, York Harbor, Maine",43.134174,-70.640878,98.0,M,3.0,1.0,0.0,1.0,4.0,0.0,15.0,1.0,1.0,2000.0,4.0,3.0,1993-03,2024.0,1993.0,1995.0,2024.0,30.0,,,AGAMENTICUS YACHT CLUB OF YORK,3.0


In [110]:
%%time
# For each EIN, count the distinct non-missing values of each classification variable across its filings.
dfg = merged[['EIN', '501c3', 'BMF_BMF_SUBSECTION_CODE', 'BMF_BMF_FOUNDATION_CODE']].groupby('EIN').nunique()
dfg = dfg.reset_index()
dfg.describe().T

CPU times: total: 1.77 s
Wall time: 1.94 s


Unnamed: 0,count,mean,std,min,25%,50%,75%,max
501c3,456945.0,1.002716,0.052043,1.0,1.0,1.0,1.0,2.0
BMF_BMF_SUBSECTION_CODE,456945.0,0.98225,0.132043,0.0,1.0,1.0,1.0,1.0
BMF_BMF_FOUNDATION_CODE,456945.0,0.979608,0.141337,0.0,1.0,1.0,1.0,1.0
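
A note on reading the table above: `nunique` skips missing values by default, so a `min` of 0.0 on the two BMF columns means that some EINs have no non-missing BMF value at all (presumably the EINs without a BMF match), while a `max` of 2.0 on *501c3* means that some EINs are coded both 0 and 1 across their filings. A minimal sketch on made-up data; the EINs and values here are hypothetical and only illustrate the pattern:

```python
import numpy as np
import pandas as pd

# Hypothetical toy filings: EIN 'A' is consistent, 'B' flips its 501c3 flag,
# and 'C' has no BMF match (subsection code is NaN on every row).
toy = pd.DataFrame({
    'EIN': ['A', 'A', 'B', 'B', 'C'],
    '501c3': [1, 1, 0, 1, 1],
    'BMF_BMF_SUBSECTION_CODE': [3.0, 3.0, 3.0, 3.0, np.nan],
})

# Count distinct non-missing values per EIN, as in the cell above.
toy_counts = toy.groupby('EIN').nunique().reset_index()
print(toy_counts)

# EINs whose 501c3 flag takes more than one value across filings.
toy_mixed = toy_counts.loc[toy_counts['501c3'] > 1, 'EIN'].tolist()
# EINs with no non-missing BMF subsection code at all.
toy_no_bmf = toy_counts.loc[toy_counts['BMF_BMF_SUBSECTION_CODE'] == 0, 'EIN'].tolist()
print(toy_mixed, toy_no_bmf)
```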


In [111]:
# EINs whose filings disagree on *501c3* (more than one distinct value), and all filings by those EINs
mixed_eins = dfg.loc[dfg['501c3']>1, 'EIN'].tolist()
mixed_filings = merged[merged['EIN'].isin(mixed_eins)]
print(len(mixed_eins))
print(len(set(mixed_eins)))
print(len(mixed_filings))
print('# of 990 filings with >1 value for *501c3*:', len(mixed_filings))
mixed_filings[['EIN', 'F9_00_HD_TAX_YEAR', '501c3', 'BMF_BMF_SUBSECTION_CODE', 'BMF_BMF_FOUNDATION_CODE']][:15]

1241
1241
12105
# of 990 filings with >1 value for *501c3*: 12105


Unnamed: 0,EIN,F9_00_HD_TAX_YEAR,501c3,BMF_BMF_SUBSECTION_CODE,BMF_BMF_FOUNDATION_CODE
635901,10224718,2013,1,3.0,16.0
839636,10224718,2014,1,3.0,16.0
1138695,10224718,2015,1,3.0,16.0
1558534,10224718,2017,1,3.0,16.0
2029492,10224718,2018,1,3.0,16.0
2183137,10224718,2020,1,3.0,16.0
2613922,10224718,2021,1,3.0,16.0
2990851,10224718,2022,1,3.0,16.0
3178498,10224718,2023,1,3.0,16.0
226189,10224718,2010,0,3.0,16.0


In [112]:
print('# of 990 filings with >1 value for *501c3*:', len(merged[(merged['EIN'].isin(dfg[dfg['501c3']>1]['EIN'].tolist()))]))
#print('# of 990 filings with >1 value for *501c3*:', len(merged[(merged['EIN'].isin(dfg[dfg['501c3']>1]['EIN'].tolist())) & (merged['501c3']==1)]))

# of 990 filings with >1 value for *501c3*: 12105


In [113]:
%%time
print(len(merged[~(merged['EIN'].isin(dfg[dfg['501c3']>1]['EIN'].tolist()))]))
merged[~(merged['EIN'].isin(dfg[dfg['501c3']>1]['EIN'].tolist()))]['501c3'].value_counts()

3459666
CPU times: total: 36.9 s
Wall time: 38.5 s


501c3
1    2633270
0     826396
Name: count, dtype: int64

# Ended here 4/19/2025

In [114]:
mixed_501c3 = dfg[dfg['501c3']>1]['EIN'].tolist()
print(len(mixed_501c3))
print(mixed_501c3[:5])

1241
['010224718', '010377828', '010428565', '010458555', '010550458']


In [116]:
merged[merged['EIN'].isin(mixed_501c3[:2])][['EIN', 'F9_00_HD_TAX_YEAR', '501c3','BMF_BMF_SUBSECTION_CODE', 
                                             'BMF_BMF_FOUNDATION_CODE']]

Unnamed: 0,EIN,F9_00_HD_TAX_YEAR,501c3,BMF_BMF_SUBSECTION_CODE,BMF_BMF_FOUNDATION_CODE
635901,10224718,2013,1,3.0,16.0
839636,10224718,2014,1,3.0,16.0
1138695,10224718,2015,1,3.0,16.0
1558534,10224718,2017,1,3.0,16.0
2029492,10224718,2018,1,3.0,16.0
2183137,10224718,2020,1,3.0,16.0
2613922,10224718,2021,1,3.0,16.0
2990851,10224718,2022,1,3.0,16.0
3178498,10224718,2023,1,3.0,16.0
226189,10224718,2010,0,3.0,16.0


In [132]:
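# Same two counts as the next cell, which uses mixed_501c3 instead of rebuilding the EIN list.
# The figures shown below differ from that cell's output and appear to be left over from an earlier dataset.
#print(len(merged[(~merged['EIN'].isin(dfg[dfg['501c3']>1]['EIN'].tolist())) &(merged['501c3']==1)]))
#print(len(set(merged[(~merged['EIN'].isin(dfg[dfg['501c3']>1]['EIN'].tolist())) &(merged['501c3']==1)]['EIN'].tolist())))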

1433427
256478


In [117]:
print(len(merged[(~merged['EIN'].isin(mixed_501c3)) & (merged['501c3']==1)]))
print(len(set(merged[(~merged['EIN'].isin(mixed_501c3)) & (merged['501c3']==1)]['EIN'].tolist())))

2633270
351875


In [120]:
print(len(set(merged[merged['501c3']==1]['EIN'].tolist())))

353116


In [118]:
merged['501c3'].value_counts()

501c3
1    2639594
0     832177
Name: count, dtype: int64

In [119]:
2639594-2633270

6324

In [121]:
353116-351875

1241

<br> Based on the above, I will keep only the EINs that are consistently coded '1' on *501c3* in every year. The final dataset has 2,633,270 filings, 6,324 fewer than the 2,639,594 filings coded '1' on *501c3* before this filter. In terms of EINs, that's a drop of 1,241 EINs -- from 353,116 down to 351,875.
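
The selection rule can also be written without building the list of mixed EINs first. The sketch below uses hypothetical toy data and `transform('nunique')`; it is an alternative way to express the rule, not the code that was run in this notebook (which uses `isin`). It keeps a filing only when its EIN has a single distinct *501c3* value and that value is 1, then computes the same two "dropped" differences reported above (6,324 filings and 1,241 EINs on the real data).

```python
import pandas as pd

# Hypothetical toy filings: 'A' and 'C' are always 1, 'B' is mixed, 'D' is always 0.
toy = pd.DataFrame({
    'EIN': ['A', 'A', 'B', 'B', 'C', 'D'],
    '501c3': [1, 1, 0, 1, 1, 0],
})

# Number of distinct 501c3 values per EIN, broadcast back onto each filing.
n_values = toy.groupby('EIN')['501c3'].transform('nunique')

# Keep filings coded 1 whose EIN never takes any other value.
toy_501c3 = toy[(n_values == 1) & (toy['501c3'] == 1)]

# Filings and EINs coded 1 that are lost by requiring consistency.
filings_dropped = (toy['501c3'] == 1).sum() - len(toy_501c3)
eins_dropped = toy.loc[toy['501c3'] == 1, 'EIN'].nunique() - toy_501c3['EIN'].nunique()
print(len(toy_501c3), filings_dropped, eins_dropped)
```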

#### Save DF with just 501(c)(3) filings

In [122]:
len(merged)

3471771

In [123]:
%%time
merged_501c3 = merged[(~merged['EIN'].isin(mixed_501c3)) & (merged['501c3']==1)]
print(len(merged_501c3))
print(len(set(merged_501c3['EIN'].tolist())))
print(merged_501c3['501c3'].value_counts())

2633270
351875
501c3
1    2633270
Name: count, dtype: int64
CPU times: total: 7.83 s
Wall time: 8.15 s


In [124]:
len(merged_501c3)

2633270

In [125]:
gc.collect()

3098

In [129]:
%%time
print ("Current date and time : ", datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"), '\n')
merged_501c3.to_feather('D:/990_and_bmf_april_2025_all_controls_351875_orgs_2633270_filings.feather')

Current date and time :  2025-04-20 17:56:14 

CPU times: total: 39.3 s
Wall time: 35.2 s


In [130]:
%%time
print ("Current date and time : ", datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"), '\n')
merged_501c3.to_parquet("D:/990_and_bmf_april_2025_all_controls_351875_orgs_2633270_filings.parquet", engine="pyarrow", compression="snappy", index=False)

Current date and time :  2025-04-20 18:22:36 

CPU times: total: 1min 18s
Wall time: 1min 23s


In [131]:
%%time
print ("Current date and time : ", datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"), '\n')
merged_501c3.to_pickle('990_and_bmf_april_2025_all_controls_351875_orgs_2633270_filings.pkl.gz', compression='gzip')

Current date and time :  2025-04-20 18:24:00 

CPU times: total: 35min 57s
Wall time: 36min 58s
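
Since the org and filing counts are written into the file names, a quick round-trip check after saving can confirm that a file on disk matches what was in memory. The sketch below is an assumption-laden illustration: it uses a hypothetical toy frame, a temporary directory, and a made-up file name that only mimics the naming pattern above. It uses the gzip pickle format so that only pandas is needed; the same idea applies to the feather and parquet files.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for merged_501c3.
toy_501c3 = pd.DataFrame({
    'EIN': ['A', 'A', 'C'],
    '501c3': [1, 1, 1],
})

n_orgs = toy_501c3['EIN'].nunique()
n_filings = len(toy_501c3)

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file name following the <orgs>_orgs_<filings>_filings pattern.
    path = os.path.join(tmp_dir, f'toy_{n_orgs}_orgs_{n_filings}_filings.pkl.gz')
    toy_501c3.to_pickle(path, compression='gzip')
    reloaded = pd.read_pickle(path, compression='gzip')

# The reloaded frame should have the same shape and the same values.
same_shape = reloaded.shape == toy_501c3.shape
same_content = reloaded.equals(toy_501c3)
print(n_orgs, n_filings, same_shape, same_content)
```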


In [82]:
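# Earlier save from the February 2024 run, kept for reference; the output below (dated 2024-03-31) is from that run.
#%%time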
#import datetime
#print ("Current date and time : ", datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"), '\n')
#merged_501c3.to_pickle('990 and BMF control variables for all NEW filings February 2024 -- 277,112 501c3 orgs (N=664,761).pkl.gz', compression='gzip')

Current date and time :  2024-03-31 17:38:39 

CPU times: total: 10min 49s
Wall time: 11min 11s
