#### Update of this notebook:
- *IRS Form 990 e-File Data (6a) -- Change Data Types.ipynb*

### Note
These two variables are ``float64`` instead of ``object``, which suggests that every value in them is missing: ``['F9_03_PC_PROG_SVC_ACC_2_CODE', 'F9_03_PC_PROG_SVC_ACC_3_CODE']``
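
A minimal sketch of how the "all missing" guess could be checked. The toy frame below is made up for illustration (it is not the e-file DF); only the two column names come from the note above.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the e-file DF (values are invented).
toy = pd.DataFrame({
    "F9_03_PC_PROG_SVC_ACC_1_CODE": ["A12", None, "B34"],
    "F9_03_PC_PROG_SVC_ACC_2_CODE": [np.nan, np.nan, np.nan],
    "F9_03_PC_PROG_SVC_ACC_3_CODE": [np.nan, np.nan, np.nan],
})
code_cols = ["F9_03_PC_PROG_SVC_ACC_2_CODE", "F9_03_PC_PROG_SVC_ACC_3_CODE"]

# True for a column only if every value in that column is missing.
all_missing = toy[code_cols].isna().all()
print(all_missing)

# A column holding nothing but NaN is stored as float64, not object.
print(toy.dtypes)
```

On the real data the same two lines (`df[code_cols].isna().all()` and `df[code_cols].dtypes`) would confirm or refute the note.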

# Overview

Main purpose of notebook:
- All of the variables are currently stored as strings, so in this notebook I change the data type of the relevant variables to a *timestamp*, *int*, or *float*. This notebook implements a new approach to changing data types.

Steps:
- Read in *concordance_VERIFIED.xlsx* in order to access the *python_data_type* column
    - Collapse to *new_variables_df* then use that DF

- Read in e-file DF: 
    - *all filings may 2021 - all control variables (with parsed sub-key variables).pkl.gz*

- Change data types for *DateTime* and *Int64* variables:
    - string variables -- nothing done
    - DateTime variables -- change both relevant variables
    - Int64 variables -- change all with one-liner: 
        - df[Int64_vars] = df[Int64_vars].apply(pd.to_numeric)
        - This one-liner avoids problems with converting directly to 'Int64' -- *pd.to_numeric( )* picks 'int64' for columns that hold only whole numbers with no missing values, and 'float64' otherwise

- Save DF:
    - *all filings may 2021 - all control variables (with parsed sub-key variables and reformatted types).pkl.gz*

Note:
- Even though I use 'Int64' as the value for all integer columns in *python_data_type*, *pd.to_numeric( )* decides whether each column becomes 'int64' or 'float64'
- Three variables -- *fiscal_year*, *TaxPeriod*, and *F9_00_HD_TAX_PER_END* -- are all based on the same date, which is the *END* of the tax period, while *F9_00_HD_TAX_YEAR* reflects the year in which the tax period *BEGINS*. See *IRS 990 e-File Data -- CONTROL VARIABLES (A2) -- Combine Columns (Python 3.6).ipynb*
- In a previous version of this notebook I looped over all variables in each *data_type_xsd* (e.g., DateType, StateType, etc.) and inspected the *dtypes* and sample values for these variables. Based on that I added the *python_data_type* column to the concordance file:
    - See *IRS 990 e-File Data -- CONTROL VARIABLES (A4-1a) -- Change Data Types (with data type verification code -- no longer needed).ipynb*
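
To illustrate the note about *pd.to_numeric( )*, here is a small sketch on two made-up string Series (the names `complete` and `with_gap` are hypothetical). The last step, casting to the nullable 'Int64' dtype, is an optional extra and not something this notebook does.

```python
import pandas as pd

# String columns like those in the e-file DF before conversion (values invented).
complete = pd.Series(["1", "0", "25"])     # no missing values
with_gap = pd.Series(["1", None, "25"])    # one missing value

print(pd.to_numeric(complete).dtype)   # int64
print(pd.to_numeric(with_gap).dtype)   # float64 (the missing value becomes NaN)

# Optional: if whole numbers are wanted despite missing values, the float
# column can be cast to pandas' nullable integer dtype afterwards.
nullable = pd.to_numeric(with_gap).astype("Int64")
print(nullable.dtype)                  # Int64
```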

# Load Packages and Connect to MongoDB

In [1]:
import numpy as np
import pandas as pd
from pandas import DataFrame
from pandas import Series

In [2]:
print(pd.__version__)

2.2.2


In [3]:
#http://pandas.pydata.org/pandas-docs/stable/options.html
pd.set_option('display.max_columns', None)
pd.set_option('max_colwidth', 250)

In [4]:
import datetime

In [10]:
import gc

#### Set working directory

In [6]:
cd "C:\\Users\\Gregory\\IRS 990 Control Variables\\"

C:\Users\Gregory\IRS 990 Control Variables


# Read in Concordance File
We are going to read in the 'concordance' file, which is a specialized codebook and contains details about each data column in the e-file data.

In [7]:
%%time
print ("Current date and time : ", datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"), '\n')
concordance = pd.read_excel('concordance_VERIFIED.xlsx')
print('# of columns:', len(concordance.columns))
print('# of observations:', len(concordance))
concordance[:2]

Current date and time :  2025-04-18 12:12:31 

# of columns: 17
# of observations: 574
CPU times: total: 531 ms
Wall time: 2.32 s


Unnamed: 0,xpath,variable_name_new,# of Characters (newly named),variable name notes,PARSING NOTES,OTHER NOTES,description,location_code,part,data_type_xsd,python_data_type,fill_null,BINARIZE,MongoDB_Name,sub_key,sub_sub_key,cardinality
0,/Return/ReturnData/IRS990/SpecialConditionDesc,F9_00_HD_SPECIAL_CONDITION_DESC,,,,,Special condition description,F990-PC-PART-00,PART-00,TextType,string,Do not fill null,,SpecialConditionDesc,,,
1,/Return/ReturnData/IRS990/SpecialConditionDescription,F9_00_HD_SPECIAL_CONDITION_DESC,31.0,,,,Special condition description,F990-PC-PART-00,PART-00,TextType,string,Do not fill null,,SpecialConditionDescription,,,


In [8]:
concordance[concordance['sub_key'].notnull()][['variable_name_new', 'MongoDB_Name', 'sub_key']]

Unnamed: 0,variable_name_new,MongoDB_Name,sub_key
122,F9_03_PC_PROG_SVC_ACC_2_CODE,Activity2,ActivityCode
123,F9_03_PC_PROG_SVC_ACC_2_CODE,ProgSrvcAccomActy2Grp,ActivityCode
124,F9_03_PC_PROG_SVC_ACC_3_CODE,Activity3,ActivityCode
125,F9_03_PC_PROG_SVC_ACC_3_CODE,ProgSrvcAccomActy3Grp,ActivityCode
146,F9_03_PC_PROG_SVC_ACC_2_DESC,Activity2,Description
...,...,...,...
556,F9_00_HD_FILER_CITY_US,Filer,USAddress
557,F9_00_HD_FILER_COUNTRY_FRGN,Filer,ForeignAddress
558,F9_00_HD_FILER_COUNTRY_FRGN,Filer,ForeignAddress
559,F9_00_HD_FILER_ZIP_US,Filer,USAddress


# Read 990 DB into PANDAS DF
Read in the PANDAS dataframe from the previous notebook.

In [9]:
#%%time
#import datetime
#print ("Current date and time : ", datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"), '\n')
#df = pd.read_pickle('all NEW filings February 2024 - all control variables (with parsed sub-key variables).pkl.gz', compression='gzip')
#print('# of columns:', len(df.columns))
#print('# of observations:', len(df))
#df[:1]

In [12]:
%%time
import datetime
print ("Current date and time : ", datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"), '\n')
df = pd.read_feather('D:/all_filings_april_2025_all_controls_combined_parsed.feather')
print('# of columns:', len(df.columns))
print('# of observations:', len(df))
df[:1]

Current date and time :  2025-04-18 12:13:09 

# of columns: 305
# of observations: 3469008
CPU times: total: 2min 43s
Wall time: 2min 19s


Unnamed: 0,_id,OrganizationName,URL,DLN,TaxPeriod,F9_09_PC_FEES_FOR_SVCE_FR_TOT,F9_00_HD_BUILD_TIME_STAMP,fiscal_year,EIN,Name,NameControl,Phone,USAddress,ForeignAddress,InCareOfName,BusinessName,BusinessNameControlTxt,PhoneNum,InCareOfNm,ForeignPhoneNum,F9_00_HD_ADDR_CHANGE,F9_00_HD_AMENDED_RETURN,F9_00_HD_CTRY_OF_DOMICILE,F9_00_HD_EXEMPT_STATUS_4847A1,F9_00_HD_EXEMPT_STATUS_501C,F9_00_HD_EXEMPT_STATUS_501C3,F9_00_HD_FINAL_RETURN,F9_00_HD_GROSS_EXEMPT_NUM,F9_00_HD_GROSS_RCPT,F9_00_HD_GROUP_RETURN,F9_00_HD_INCLUDES_SUBORD_ORGS,F9_00_HD_INITIAL_RETURN,F9_00_HD_PRIN_OFF_NAME,F9_00_HD_SIGNING_OFFICER_SIGNTR,F9_00_HD_SPECIAL_CONDITION_DESC,F9_00_HD_STATE_OF_DOMICILE,F9_00_HD_TAX_PER_BEGIN,F9_00_HD_TAX_PER_END,F9_00_HD_TAX_YEAR,F9_00_HD_TIME_STAMP,F9_00_HD_TYPE_ORG_ASSOCIATION,F9_00_HD_TYPE_ORG_CORP,F9_00_HD_TYPE_ORG_OTHER,F9_00_HD_TYPE_ORG_OTHER_DESC,F9_00_HD_TYPE_ORG_TRUST,F9_00_HD_WEBSITE,F9_00_HD_YEAR_FORMED,F9_01_PC_BEN_PAID_MEMB_PRIOR,F9_01_PC_CONTR_GRANTS_CURR,F9_01_PC_CONTR_GRANTS_PRIOR,F9_01_PC_GRANTS_PRIOR,F9_01_PC_INDEP_VOTING_MEMB,F9_01_PC_INVEST_INCOME_PRIOR,F9_01_PC_NET_ASSETS_BOY,F9_01_PC_OTHER_EXPENSE_PRIOR,F9_01_PC_OTHER_REV_PRIOR,F9_01_PC_PROF_FUNDRISING_EXP_CURR,F9_01_PC_PROF_FUNDRISING_EXP_PRIOR,F9_01_PC_PROG_SERVICE_REV_PRIOR,F9_01_PC_REV_LESS_EXP_CURR,F9_01_PC_REV_LESS_EXP_PRIOR,F9_01_PC_TERMINATION_CONTRACTION,F9_01_PC_TOT_ASSETS_EOY,F9_01_PC_TOT_EXP_PRIOR,F9_01_PC_TOT_FNDR_EXP_CURR,F9_01_PC_TOT_INDIV_EMPLOYED,F9_01_PC_TOT_INDIV_VOLUNTEERS,F9_01_PC_TOT_LIABILITIES_EOY,F9_01_PC_TOT_REVENUE_PRIOR,F9_01_PC_TOT_UBI_GROSS,F9_01_PC_TOT_UBI_NET,F9_01_PC_VOTING_MEMB_GOV_BODY,F9_01_PZ_BEN_PAID_TO_MEMB_CURR,F9_01_PZ_GRANTS_PAID_CURR,F9_01_PZ_INVEST_INCOME_CURR,F9_01_PZ_NAFB_EOY,F9_01_PZ_ORGANIZATIONAL_MISSION,F9_01_PZ_OTHER_EXPENSE_CURR,F9_01_PZ_OTHER_REV_CURR,F9_01_PZ_PROG_SERVICE_REV_CURR,F9_01_PZ_SALARIES_CURR,F9_01_PZ_SALARIES_PRIOR,F9_01_PZ_TOT_ASSETS_BOY,F9_01_PZ_TOT_EXP_CURR,F9_01_PZ_TOT_LIAB_BOY,F9_01_PZ_TOT_REV_CURR,F9_03_PC_PGMSVC_SIGNIF_CHG,F9_03_
PC_PGMSVC_SIGNIF_NEW,F9_03_PC_PROG_SVC_ACC_1_CODE,F9_03_PC_PROG_SVC_ACC_1_DESC,F9_03_PC_PROG_SVC_ACC_1_EXP,F9_03_PC_PROG_SVC_ACC_1_GRNT,F9_03_PC_PROG_SVC_ACC_1_REV,F9_03_PC_PROG_SVC_ACC_2_CODE,F9_03_PC_PROG_SVC_ACC_2_DESC,F9_03_PC_PROG_SVC_ACC_2_EXP,F9_03_PC_PROG_SVC_ACC_2_GRNT,F9_03_PC_PROG_SVC_ACC_2_REV,F9_03_PC_PROG_SVC_ACC_3_CODE,F9_03_PC_PROG_SVC_ACC_3_DESC,F9_03_PC_PROG_SVC_ACC_3_EXP,F9_03_PC_PROG_SVC_ACC_3_GRNT,F9_03_PC_PROG_SVC_ACC_3_REV,F9_03_PC_TOT_OTH_PROG_SVC_EXP,F9_03_PC_TOT_OTH_PROG_SVC_GRNT,F9_03_PC_TOT_OTH_PROG_SVC_REV,F9_03_PC_TOT_PROG_SVC_EXPENSE,F9_03_PZ_MISSION_DESCRIPTION,F9_03_PZ_SCHEDULE_O_PART3,F9_04_PC_ACTVITIES_VIA_PARTNER,F9_04_PC_CONTROLLED_ENTITY,F9_04_PC_DISREGARDED_ENTITY,F9_04_PC_EXCESS_BENEFIT_TRANS,F9_04_PC_FR_EVENT_INC_GT_15K,F9_04_PC_GAMING_INC_GT_15K,F9_04_PC_LOBBYING_ACTIVITIES,F9_04_PC_POLITICAL_ACTIVITIES,F9_04_PC_PRIOR_EXCESS_BEN_TRAN,F9_04_PC_PROF_FR_EXP_GT_15K,F9_04_PC_RELATED_ENTITY,F9_04_PC_TRANS_TO_CNTRLD_ENT,F9_04_PC_TRANS_WITH_CNTRLD_ENT,F9_05_EXP_SCHED_O_X,F9_05_PC_NUMBER_EMPLOYEES_W3,F9_05_PC_NUMBER_FORMS_1096,F9_05_PC_UNRELATED_BUS_INCOME,F9_06_EXP_SCHED_O_X,F9_06_PC_990_PROVIDED_GOV_BODY,F9_06_PC_ANNUAL_DISC_COVRD_PERS,F9_06_PC_CEO_COMPENSTN_PROCESS,F9_06_PC_CHANGES_ORGANIZING_DOCS,F9_06_PC_CONFLICT_OF_INTEREST,F9_06_PC_DECISIONS_SUBJ_APPROVAL,F9_06_PC_DELEGATION_MGT_DUTIES,F9_06_PC_DELEGATION_OF_MGT,F9_06_PC_DOCUMENT_RET_POLICY,F9_06_PC_ELECTION_BOARD_MEMBERS,F9_06_PC_FAMILY_OR_BUSINESS_REL,F9_06_PC_FORM_AVAIL_OWN_WEBSITE,F9_06_PC_FORM_UPON_REQUEST,F9_06_PC_JOINT_VENTURE_INVESTMNT,F9_06_PC_JOINT_VENTURE_POLICY,F9_06_PC_LOCAL_CHAPTERS,F9_06_PC_MATERIAL_DIVERSION,F9_06_PC_MEMBERS_OR_STOCKHOLDERS,F9_06_PC_MINUTES_COMMITTEES,F9_06_PC_MINUTES_GOVERNING_BODY,F9_06_PC_MONITORING_OF_COI_POLICY,F9_06_PC_NUM_IND_VOTING_MEMBERS,F9_06_PC_NUM_VOTING_GOV_MEMBERS,F9_06_PC_OFFICER_MAILING_ADDRESS,F9_06_PC_OTHER_COMPENSTN_PROCESS,F9_06_PC_OTHER_WEBSITE,F9_06_PC_OWN_WEBSITE,F9_06_PC_POLICIES_GOVERN_CHAPTER,F9_06_PC_STATES_WHERE_RET
_FILED,F9_06_PC_WHISTLEBLOWER_POLICY,F9_07_EXP_SCHED_O_X,F9_07_PC_COMPENSATION_OTHER_SRCE,F9_07_PC_FORMER_OFFICER_LISTED,F9_07_PC_NO_LISTED_PERS_COMPENSD,F9_07_PC_NUM_CONTRCTRS_GRTR_100K,F9_07_PC_NUM_INDS_GREATER_100K,F9_07_PC_TOTAL_COMP_GRTR_150K,F9_07_PC_TOT_OTHER_COMPENSATION,F9_07_PC_TOT_REPRT_COMP_FROM_ORG,F9_07_PC_TOT_REPRT_COMP_RLTD_ORG,F9_08_EXP_SCHED_O_X,F9_08_PC_ALL_OTHER_CONTRIBUTIONS,F9_08_PC_CONTS_REPRTD_FNDRAISNG,F9_08_PC_COST_OF_GOODS_SOLD,F9_08_PC_FEDERATED_CAMPAIGNS,F9_08_PC_FUNDRAISING_DIRECT_EXP,F9_08_PC_FUNDRAISING_EVENTS,F9_08_PC_FUNDRAISING_GROSS_INC,F9_08_PC_GAMING_DIRECT_EXPENSES,F9_08_PC_GAMING_GROSS_INCOME,F9_08_PC_GOVERNMENT_GRANTS,F9_08_PC_GROSS_SALES_INVENTORY,F9_08_PC_MEMBERSHIP_DUES,F9_08_PC_NONCASH_CONTRIBUTIONS,F9_08_PC_PROGRAM_SVCE_REV_TOTAL,F9_08_PC_RELATED_ORGANIZATIONS,F9_08_PC_TOTAL_CONTRIBUTIONS,F9_08_PC_TOTAL_OTHER_REVENUE,F9_08_PC_TOTAL_PROG_SVCE_REVENUE,F9_08_PC_TOTAL_REVENUE,F9_09_EXP_AD_PROMO_TOT,F9_09_EXP_BENF_PAID_MEMB_TOT,F9_09_EXP_CONF_MEETING_TOT,F9_09_EXP_DEPREC_FUNDR,F9_09_EXP_DEPREC_MAG,F9_09_EXP_DEPREC_PROG,F9_09_EXP_DEPREC_TOT,F9_09_EXP_GRANT_FRGN_TOT,F9_09_EXP_GRANT_INDIV_DMSTC_TOT,F9_09_EXP_GRANT_ORG_DMSTC_TOT,F9_09_EXP_INFO_TECH_TOT,F9_09_EXP_INSURANCE_TOT,F9_09_EXP_INTEREST_TOT,F9_09_EXP_JOINT_COSTS_TOT,F9_09_EXP_OCCUPANCY_TOT,F9_09_EXP_OFFICE_TOT,F9_09_EXP_OTH_OTH_TOT,F9_09_EXP_ROY_TOT,F9_09_EXP_SCHED_O_X,F9_09_EXP_TRAVEL_ENTRTNMNT_TOT,F9_09_EXP_TRAVEL_TOT,F9_09_PC_COMP_DISQUAL_FUNDRAISE,F9_09_PC_COMP_DISQUAL_MGMT,F9_09_PC_COMP_DISQUAL_PROG_SVCE,F9_09_PC_COMP_DISQUAL_TOTAL,F9_09_PC_COMP_OFFICERS_FUNDRAISE,F9_09_PC_COMP_OFFICERS_MGMT,F9_09_PC_COMP_OFFICERS_PROG_SVCE,F9_09_PC_COMP_OFFICERS_TOTAL,F9_09_PC_FEES_FOR_SVCE_ACCT_TOT,F9_09_PC_FEES_FOR_SVCE_INVST_TOT,F9_09_PC_FEES_FOR_SVCE_LEGL_TOT,F9_09_PC_FEES_FOR_SVCE_LOBB_TOT,F9_09_PC_FEES_FOR_SVCE_MGMT_TOT,F9_09_PC_FEES_FOR_SVCE_OTH_TOT,F9_09_PC_OTHER_EMP_BEN_FUNDRAISE,F9_09_PC_OTHER_EMP_BEN_MGMT,F9_09_PC_OTHER_EMP_BEN_PROG_SVCE,F9_09_PC_OTHER_EMP_BEN_TOTAL,F9_09
_PC_OTHER_SALARY_FUNDRAISE,F9_09_PC_OTHER_SALARY_MGMT,F9_09_PC_OTHER_SALARY_PROG_SVCE,F9_09_PC_OTHER_SALARY_TOTAL,F9_09_PC_PAYMENT_TO_AFFILIATES,F9_09_PC_PAYROLL_TAX_FUNDRAISE,F9_09_PC_PAYROLL_TAX_MGMT,F9_09_PC_PAYROLL_TAX_PROG_SVCE,F9_09_PC_PAYROLL_TAX_TOTAL,F9_09_PC_PENSION_CONT_FUNDRAISE,F9_09_PC_PENSION_CONT_MGMT,F9_09_PC_PENSION_CONT_PROG_SVCE,F9_09_PC_PENSION_CONT_TOTAL,F9_09_PC_TOTAL_FUNC_EXPENSES,F9_09_PC_TOTAL_FUNDRAISE_EXPENSE,F9_09_PC_TOTAL_MGMT_EXPENSE,F9_09_PC_TOTAL_PROG_SVCE_EXPENSE,F9_10_ASSETS_ACC_NET_EOY,F9_10_ASSETS_EXP_PREPAID_EOY,F9_10_ASSETS_INTANGIB_EOY,F9_10_ASSETS_INVENT_SALE_EOY,F9_10_ASSETS_LESS_DEPREC_EOY,F9_10_ASSETS_LOANS_DISQUAL_EOY,F9_10_ASSETS_NOTES_LOANS_NET_EOY,F9_10_ASSETS_OTH_EOY,F9_10_ASSETS_PLEDGES_NET_EOY,F9_10_LIAB_ACC_PAYABLE_EOY,F9_10_LIAB_GRANTS_PAYABLE_EOY,F9_10_LIAB_LOANS_OFF_EOY,F9_10_LIAB_REV_DEFERRED_EOY,F9_10_NAFB_RESTRICT_PERM_EOY,F9_10_NAFB_RESTRICT_TEMP_EOY,F9_10_NAFB_UNRESTRICT_EOY,F9_10_PC_BOND_LIABILITY_EOY,F9_10_PC_CASH_NON_INTEREST_BOY,F9_10_PC_CASH_NON_INTEREST_EOY,F9_10_PC_ESCROW_LIABILITY_EOY,F9_10_PC_INVEST_OTHER_SEC_EOY,F9_10_PC_INVEST_PROG_RELTD_EOY,F9_10_PC_INVEST_PUB_TRADED_EOY,F9_10_PC_LAND_BLDG_EQPMT,F9_10_PC_LAND_BLDG_EQPMT_DEPRCTN,F9_10_PC_LOANS_FROM_OFFICERS_EOY,F9_10_PC_ORG_FOLLOWS_SFAS117,F9_10_PC_ORG_NOT_FOLLOW_SFAS117,F9_10_PC_OTHER_LIABILITIES_EOY,F9_10_PC_RET_EARNINGS_ENDWMT_EOY,F9_10_PC_SAVINGS_TEMP_INVEST_BOY,F9_10_PC_SAVINGS_TEMP_INVEST_EOY,F9_10_PC_SECURED_MORTGAGES_EOY,F9_10_PC_SECURE_MORT_NOTES_EOY,F9_10_PC_UNSECURED_LOANS_EOY,F9_10_PC_UNSECURED_NOTES_BOY,F9_10_PC_UNSECURED_NOTES_EOY,F9_10_PZ_TOTAL_ASSETS_EOY,F9_10_SCHED_O_X,F9_11_PC_RECNCLTN_DONATED_SVCES,F9_11_PC_RECNCLTN_INVSTMNT_EXP,F9_11_PC_RECNCLTN_PRIOR_PER_ADJ,F9_11_PC_RECNCLTN_REV_LESS_EXP,F9_11_PC_RECNCLTN_UNRLZD_GAIN,F9_11_SCHED_O_X,F9_12_PC_ACCNT_COMPILE_OR_REVIEW,F9_12_PC_ACCTG_METHOD_ACCRUAL,F9_12_PC_ACCTG_METHOD_CASH,F9_12_PC_ACCTG_METHOD_OTHER,F9_12_PC_AUDIT_COMMITTEE,F9_12_PC_FED_GRNT_AUDIT_PERFORMD,F9_12_PC_FED_GRNT_A
UDIT_REQUIRED,F9_12_PC_FINCL_STMTS_AUDITED,F9_12_SCHED_O_X,number_of_other_prog_svces,501c3,F9_00_HD_FILER_ADDR_US_L1,F9_00_HD_FILER_ADDR_US_L2,F9_00_HD_FILER_CITY_US,F9_00_HD_FILER_ZIP_US,F9_00_HD_FILER_COUNTRY_FRGN,F9_00_HD_FILER_STATE_US
0,5d019e6778ffca27b42818d7,RONALD MCDONALD HOUSE CHARITIES- PHILADELPHIA REGION INC,https://s3.amazonaws.com/irs-form-990/201113139349301301_public.xml,93493313013011,201012,,2016-02-24 21:20:13Z,,232705170,"{'BusinessNameLine1': 'RONALD MCDONALD HOUSE CHARITIES-', 'BusinessNameLine2': 'PHILADELPHIA REGION INC'}",RONA,8565826843,"{'AddressLine1': '1525 VALLEY CENTER PARKWAY NO 300', 'AddressLine1Txt': None, 'AddressLine2': None, 'AddressLine2Txt': None, 'City': 'BETHLEHEM', 'CityNm': None, 'State': 'PA', 'StateAbbreviationCd': None, 'ZIPCd': None, 'ZIPCode': '18017'}",,,,,,,,1,0,,0,,1,0,,1473903,0,0,0,MICHAEL ANTON,2011-11-04,,PA,2010-01-01,2010-12-31,2010,2011-11-09T06:41:09-06:00,0,1,0,,0,,1992,0,1439340,1044925,638637,10,30447,1753405,243131,0,0,0,0,89152,193604,0,2440859,881768,195892,0,0,450430,1075372,0,0,10,0,925000,33563,1990429,MAKES GRANTS TO NON-PROFITS THAT DIRECTLY IMPROVE THE HEALTH AND WELL-BEING OF CHILDREN.,459751,1000,0,0,0,1925215,1384751,171810,1473903,0,0,,"RMHC OF THE PHILADELPHIA REGION, INC. GRANTS HUNDREDS OF THOUSANDS OF DOLLARS PER YEAR TO SUPPORT NON-PROFIT PROGRAMS THAT DIRECTLY IMPROVE THE HEALTH AND WELL-BEING OF CHILDREN. LOCALLY, RMHC SUPPORTS THE PHILADELPHIA, SOUTHERN NEW JERSEY AND DE...",1043744,925000,,,,,,,,,,,,,,,1043744,"THE CORPORATION IS ORGANIZED AND WILL BE OPERATED EXCLUSIVELY FOR CHARITABLE, EDUCATIONAL AND SCIENTIFIC PURPOSES WITHIN THE MEANING OF SECTION 501(C)(3) OF THE INTERNAL REVENUE CODE. 
SUCH PURPOSES SHALL BE LIMITED TO PROVIDING SUPPORT AND FUNDIN...",1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,1,1,1,10,10,0,0,0,0,0,"[""PA"", ""NJ"", ""DE""]",0,0,0,0,1,0,0,0,0,0,0,0,1439340,,,,,,,,,,,,,,,1439340,1000,,1473903,,,,,86228,,86228,,33000,892000,,,,,,123,763,,0,,,,,,,,,,,21675,,215,,,,,,,,,,,,118744,,,,,,,,,1384751,195892,145115,1043744,147981,,,,170617,,,,,44353,166000,,,,,1990429,,,,,1851561,,,256845,86228,,1,0,240077,,332660,270700,,,,,,2440859,0,,,,89152,,1,0,1,0,,1,0,0,1,1,,1,1525 VALLEY CENTER PARKWAY NO 300,,BETHLEHEM,18017,,PA


# Combine variables for alternative concordance file
Here we are 'collapsing' the concordance file to have one row per variable.

In [13]:
#def agg_funcs(x):
#    names = {
#        'original_names':  list(set(x['MongoDB_Name'].tolist())),
#        'python_data_type': x['python_data_type'].head(1).values[0]
#    }
#    #THE FOLLOWING SHORTCUT WORKS BUT CHANGES THE ORDER OF THE COLUMNS
#    #return pd.Series(names, index = list(names.keys()))
#    return pd.Series(names, index=['original_names', 'python_data_type'])
#new_variables_df = concordance.groupby(['variable_name_new']).apply(agg_funcs)
#new_variables_df = new_variables_df.reset_index()
#print('# of variables:', len(new_variables_df))
#new_variables_df[:]

# of variables: 288


Unnamed: 0,variable_name_new,original_names,python_data_type
0,F9_00_HD_ADDR_CHANGE,"[AddressChangeInd, AddressChange]",Int64
1,F9_00_HD_AMENDED_RETURN,"[AmendedReturn, AmendedReturnInd]",Int64
2,F9_00_HD_BUILD_TIME_STAMP,[BuildTS],DateTime
3,F9_00_HD_CTRY_OF_DOMICILE,"[CountryLegalDomicile, LegalDomicileCountryCd]",string
4,F9_00_HD_EXEMPT_STATUS_4847A1,"[Organization4947a1, Organization4947a1NotPFInd]",Int64
...,...,...,...
283,F9_12_PC_FED_GRNT_AUDIT_REQUIRED,"[FederalGrantAuditRequiredInd, FederalGrantAuditRequired]",Int64
284,F9_12_PC_FINCL_STMTS_AUDITED,"[FSAuditedInd, FSAudited]",Int64
285,F9_12_SCHED_O_X,"[InfoInScheduleOPartXII, InfoInScheduleOPartXIIInd]",Int64
286,TaxPeriod,[TaxPeriod],string


In [13]:
%%time
new_variables_df = (
    concordance
    .groupby('variable_name_new')
    .agg(
        original_names=('MongoDB_Name', lambda x: list(set(x))),
        python_data_type=('python_data_type', 'first')
    )
    .reset_index()
  )

print('# of variables:', len(new_variables_df))
new_variables_df

# of variables: 288
CPU times: total: 0 ns
Wall time: 17.5 ms


Unnamed: 0,variable_name_new,original_names,python_data_type
0,F9_00_HD_ADDR_CHANGE,"[AddressChangeInd, AddressChange]",Int64
1,F9_00_HD_AMENDED_RETURN,"[AmendedReturnInd, AmendedReturn]",Int64
2,F9_00_HD_BUILD_TIME_STAMP,[BuildTS],DateTime
3,F9_00_HD_CTRY_OF_DOMICILE,"[CountryLegalDomicile, LegalDomicileCountryCd]",string
4,F9_00_HD_EXEMPT_STATUS_4847A1,"[Organization4947a1NotPFInd, Organization4947a1]",Int64
...,...,...,...
283,F9_12_PC_FED_GRNT_AUDIT_REQUIRED,"[FederalGrantAuditRequiredInd, FederalGrantAuditRequired]",Int64
284,F9_12_PC_FINCL_STMTS_AUDITED,"[FSAudited, FSAuditedInd]",Int64
285,F9_12_SCHED_O_X,"[InfoInScheduleOPartXIIInd, InfoInScheduleOPartXII]",Int64
286,TaxPeriod,[TaxPeriod],string
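
The two outputs above list the same *original_names* in different orders because `set` has no guaranteed order. A minimal sketch of one way to make the collapse reproducible, using `sorted(set(x))`, on a made-up three-row concordance (`toy_concordance` is hypothetical). It also checks that each variable maps to a single *python_data_type*, since `'first'` would otherwise silently hide a conflict.

```python
import pandas as pd

# Tiny stand-in for the concordance file (rows invented for illustration).
toy_concordance = pd.DataFrame({
    "variable_name_new": ["F9_00_HD_ADDR_CHANGE", "F9_00_HD_ADDR_CHANGE", "TaxPeriod"],
    "MongoDB_Name": ["AddressChangeInd", "AddressChange", "TaxPeriod"],
    "python_data_type": ["Int64", "Int64", "string"],
})

collapsed = (
    toy_concordance
    .groupby("variable_name_new")
    .agg(
        # sorted() gives the same list order on every run
        original_names=("MongoDB_Name", lambda x: sorted(set(x))),
        python_data_type=("python_data_type", "first"),
    )
    .reset_index()
)
print(collapsed)

# Sanity check: every variable should have exactly one python_data_type.
types_per_variable = toy_concordance.groupby("variable_name_new")["python_data_type"].nunique()
print((types_per_variable == 1).all())
```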


In [14]:
df.loc[[695015]]

Unnamed: 0,_id,OrganizationName,URL,DLN,TaxPeriod,F9_09_PC_FEES_FOR_SVCE_FR_TOT,F9_00_HD_BUILD_TIME_STAMP,fiscal_year,EIN,Name,NameControl,Phone,USAddress,ForeignAddress,InCareOfName,BusinessName,BusinessNameControlTxt,PhoneNum,InCareOfNm,ForeignPhoneNum,F9_00_HD_ADDR_CHANGE,F9_00_HD_AMENDED_RETURN,F9_00_HD_CTRY_OF_DOMICILE,F9_00_HD_EXEMPT_STATUS_4847A1,F9_00_HD_EXEMPT_STATUS_501C,F9_00_HD_EXEMPT_STATUS_501C3,F9_00_HD_FINAL_RETURN,F9_00_HD_GROSS_EXEMPT_NUM,F9_00_HD_GROSS_RCPT,F9_00_HD_GROUP_RETURN,F9_00_HD_INCLUDES_SUBORD_ORGS,F9_00_HD_INITIAL_RETURN,F9_00_HD_PRIN_OFF_NAME,F9_00_HD_SIGNING_OFFICER_SIGNTR,F9_00_HD_SPECIAL_CONDITION_DESC,F9_00_HD_STATE_OF_DOMICILE,F9_00_HD_TAX_PER_BEGIN,F9_00_HD_TAX_PER_END,F9_00_HD_TAX_YEAR,F9_00_HD_TIME_STAMP,F9_00_HD_TYPE_ORG_ASSOCIATION,F9_00_HD_TYPE_ORG_CORP,F9_00_HD_TYPE_ORG_OTHER,F9_00_HD_TYPE_ORG_OTHER_DESC,F9_00_HD_TYPE_ORG_TRUST,F9_00_HD_WEBSITE,F9_00_HD_YEAR_FORMED,F9_01_PC_BEN_PAID_MEMB_PRIOR,F9_01_PC_CONTR_GRANTS_CURR,F9_01_PC_CONTR_GRANTS_PRIOR,F9_01_PC_GRANTS_PRIOR,F9_01_PC_INDEP_VOTING_MEMB,F9_01_PC_INVEST_INCOME_PRIOR,F9_01_PC_NET_ASSETS_BOY,F9_01_PC_OTHER_EXPENSE_PRIOR,F9_01_PC_OTHER_REV_PRIOR,F9_01_PC_PROF_FUNDRISING_EXP_CURR,F9_01_PC_PROF_FUNDRISING_EXP_PRIOR,F9_01_PC_PROG_SERVICE_REV_PRIOR,F9_01_PC_REV_LESS_EXP_CURR,F9_01_PC_REV_LESS_EXP_PRIOR,F9_01_PC_TERMINATION_CONTRACTION,F9_01_PC_TOT_ASSETS_EOY,F9_01_PC_TOT_EXP_PRIOR,F9_01_PC_TOT_FNDR_EXP_CURR,F9_01_PC_TOT_INDIV_EMPLOYED,F9_01_PC_TOT_INDIV_VOLUNTEERS,F9_01_PC_TOT_LIABILITIES_EOY,F9_01_PC_TOT_REVENUE_PRIOR,F9_01_PC_TOT_UBI_GROSS,F9_01_PC_TOT_UBI_NET,F9_01_PC_VOTING_MEMB_GOV_BODY,F9_01_PZ_BEN_PAID_TO_MEMB_CURR,F9_01_PZ_GRANTS_PAID_CURR,F9_01_PZ_INVEST_INCOME_CURR,F9_01_PZ_NAFB_EOY,F9_01_PZ_ORGANIZATIONAL_MISSION,F9_01_PZ_OTHER_EXPENSE_CURR,F9_01_PZ_OTHER_REV_CURR,F9_01_PZ_PROG_SERVICE_REV_CURR,F9_01_PZ_SALARIES_CURR,F9_01_PZ_SALARIES_PRIOR,F9_01_PZ_TOT_ASSETS_BOY,F9_01_PZ_TOT_EXP_CURR,F9_01_PZ_TOT_LIAB_BOY,F9_01_PZ_TOT_REV_CURR,F9_03_PC_PGMSVC_SIGNIF_CHG,F9_03_
PC_PGMSVC_SIGNIF_NEW,F9_03_PC_PROG_SVC_ACC_1_CODE,F9_03_PC_PROG_SVC_ACC_1_DESC,F9_03_PC_PROG_SVC_ACC_1_EXP,F9_03_PC_PROG_SVC_ACC_1_GRNT,F9_03_PC_PROG_SVC_ACC_1_REV,F9_03_PC_PROG_SVC_ACC_2_CODE,F9_03_PC_PROG_SVC_ACC_2_DESC,F9_03_PC_PROG_SVC_ACC_2_EXP,F9_03_PC_PROG_SVC_ACC_2_GRNT,F9_03_PC_PROG_SVC_ACC_2_REV,F9_03_PC_PROG_SVC_ACC_3_CODE,F9_03_PC_PROG_SVC_ACC_3_DESC,F9_03_PC_PROG_SVC_ACC_3_EXP,F9_03_PC_PROG_SVC_ACC_3_GRNT,F9_03_PC_PROG_SVC_ACC_3_REV,F9_03_PC_TOT_OTH_PROG_SVC_EXP,F9_03_PC_TOT_OTH_PROG_SVC_GRNT,F9_03_PC_TOT_OTH_PROG_SVC_REV,F9_03_PC_TOT_PROG_SVC_EXPENSE,F9_03_PZ_MISSION_DESCRIPTION,F9_03_PZ_SCHEDULE_O_PART3,F9_04_PC_ACTVITIES_VIA_PARTNER,F9_04_PC_CONTROLLED_ENTITY,F9_04_PC_DISREGARDED_ENTITY,F9_04_PC_EXCESS_BENEFIT_TRANS,F9_04_PC_FR_EVENT_INC_GT_15K,F9_04_PC_GAMING_INC_GT_15K,F9_04_PC_LOBBYING_ACTIVITIES,F9_04_PC_POLITICAL_ACTIVITIES,F9_04_PC_PRIOR_EXCESS_BEN_TRAN,F9_04_PC_PROF_FR_EXP_GT_15K,F9_04_PC_RELATED_ENTITY,F9_04_PC_TRANS_TO_CNTRLD_ENT,F9_04_PC_TRANS_WITH_CNTRLD_ENT,F9_05_EXP_SCHED_O_X,F9_05_PC_NUMBER_EMPLOYEES_W3,F9_05_PC_NUMBER_FORMS_1096,F9_05_PC_UNRELATED_BUS_INCOME,F9_06_EXP_SCHED_O_X,F9_06_PC_990_PROVIDED_GOV_BODY,F9_06_PC_ANNUAL_DISC_COVRD_PERS,F9_06_PC_CEO_COMPENSTN_PROCESS,F9_06_PC_CHANGES_ORGANIZING_DOCS,F9_06_PC_CONFLICT_OF_INTEREST,F9_06_PC_DECISIONS_SUBJ_APPROVAL,F9_06_PC_DELEGATION_MGT_DUTIES,F9_06_PC_DELEGATION_OF_MGT,F9_06_PC_DOCUMENT_RET_POLICY,F9_06_PC_ELECTION_BOARD_MEMBERS,F9_06_PC_FAMILY_OR_BUSINESS_REL,F9_06_PC_FORM_AVAIL_OWN_WEBSITE,F9_06_PC_FORM_UPON_REQUEST,F9_06_PC_JOINT_VENTURE_INVESTMNT,F9_06_PC_JOINT_VENTURE_POLICY,F9_06_PC_LOCAL_CHAPTERS,F9_06_PC_MATERIAL_DIVERSION,F9_06_PC_MEMBERS_OR_STOCKHOLDERS,F9_06_PC_MINUTES_COMMITTEES,F9_06_PC_MINUTES_GOVERNING_BODY,F9_06_PC_MONITORING_OF_COI_POLICY,F9_06_PC_NUM_IND_VOTING_MEMBERS,F9_06_PC_NUM_VOTING_GOV_MEMBERS,F9_06_PC_OFFICER_MAILING_ADDRESS,F9_06_PC_OTHER_COMPENSTN_PROCESS,F9_06_PC_OTHER_WEBSITE,F9_06_PC_OWN_WEBSITE,F9_06_PC_POLICIES_GOVERN_CHAPTER,F9_06_PC_STATES_WHERE_RET
_FILED,F9_06_PC_WHISTLEBLOWER_POLICY,F9_07_EXP_SCHED_O_X,F9_07_PC_COMPENSATION_OTHER_SRCE,F9_07_PC_FORMER_OFFICER_LISTED,F9_07_PC_NO_LISTED_PERS_COMPENSD,F9_07_PC_NUM_CONTRCTRS_GRTR_100K,F9_07_PC_NUM_INDS_GREATER_100K,F9_07_PC_TOTAL_COMP_GRTR_150K,F9_07_PC_TOT_OTHER_COMPENSATION,F9_07_PC_TOT_REPRT_COMP_FROM_ORG,F9_07_PC_TOT_REPRT_COMP_RLTD_ORG,F9_08_EXP_SCHED_O_X,F9_08_PC_ALL_OTHER_CONTRIBUTIONS,F9_08_PC_CONTS_REPRTD_FNDRAISNG,F9_08_PC_COST_OF_GOODS_SOLD,F9_08_PC_FEDERATED_CAMPAIGNS,F9_08_PC_FUNDRAISING_DIRECT_EXP,F9_08_PC_FUNDRAISING_EVENTS,F9_08_PC_FUNDRAISING_GROSS_INC,F9_08_PC_GAMING_DIRECT_EXPENSES,F9_08_PC_GAMING_GROSS_INCOME,F9_08_PC_GOVERNMENT_GRANTS,F9_08_PC_GROSS_SALES_INVENTORY,F9_08_PC_MEMBERSHIP_DUES,F9_08_PC_NONCASH_CONTRIBUTIONS,F9_08_PC_PROGRAM_SVCE_REV_TOTAL,F9_08_PC_RELATED_ORGANIZATIONS,F9_08_PC_TOTAL_CONTRIBUTIONS,F9_08_PC_TOTAL_OTHER_REVENUE,F9_08_PC_TOTAL_PROG_SVCE_REVENUE,F9_08_PC_TOTAL_REVENUE,F9_09_EXP_AD_PROMO_TOT,F9_09_EXP_BENF_PAID_MEMB_TOT,F9_09_EXP_CONF_MEETING_TOT,F9_09_EXP_DEPREC_FUNDR,F9_09_EXP_DEPREC_MAG,F9_09_EXP_DEPREC_PROG,F9_09_EXP_DEPREC_TOT,F9_09_EXP_GRANT_FRGN_TOT,F9_09_EXP_GRANT_INDIV_DMSTC_TOT,F9_09_EXP_GRANT_ORG_DMSTC_TOT,F9_09_EXP_INFO_TECH_TOT,F9_09_EXP_INSURANCE_TOT,F9_09_EXP_INTEREST_TOT,F9_09_EXP_JOINT_COSTS_TOT,F9_09_EXP_OCCUPANCY_TOT,F9_09_EXP_OFFICE_TOT,F9_09_EXP_OTH_OTH_TOT,F9_09_EXP_ROY_TOT,F9_09_EXP_SCHED_O_X,F9_09_EXP_TRAVEL_ENTRTNMNT_TOT,F9_09_EXP_TRAVEL_TOT,F9_09_PC_COMP_DISQUAL_FUNDRAISE,F9_09_PC_COMP_DISQUAL_MGMT,F9_09_PC_COMP_DISQUAL_PROG_SVCE,F9_09_PC_COMP_DISQUAL_TOTAL,F9_09_PC_COMP_OFFICERS_FUNDRAISE,F9_09_PC_COMP_OFFICERS_MGMT,F9_09_PC_COMP_OFFICERS_PROG_SVCE,F9_09_PC_COMP_OFFICERS_TOTAL,F9_09_PC_FEES_FOR_SVCE_ACCT_TOT,F9_09_PC_FEES_FOR_SVCE_INVST_TOT,F9_09_PC_FEES_FOR_SVCE_LEGL_TOT,F9_09_PC_FEES_FOR_SVCE_LOBB_TOT,F9_09_PC_FEES_FOR_SVCE_MGMT_TOT,F9_09_PC_FEES_FOR_SVCE_OTH_TOT,F9_09_PC_OTHER_EMP_BEN_FUNDRAISE,F9_09_PC_OTHER_EMP_BEN_MGMT,F9_09_PC_OTHER_EMP_BEN_PROG_SVCE,F9_09_PC_OTHER_EMP_BEN_TOTAL,F9_09
_PC_OTHER_SALARY_FUNDRAISE,F9_09_PC_OTHER_SALARY_MGMT,F9_09_PC_OTHER_SALARY_PROG_SVCE,F9_09_PC_OTHER_SALARY_TOTAL,F9_09_PC_PAYMENT_TO_AFFILIATES,F9_09_PC_PAYROLL_TAX_FUNDRAISE,F9_09_PC_PAYROLL_TAX_MGMT,F9_09_PC_PAYROLL_TAX_PROG_SVCE,F9_09_PC_PAYROLL_TAX_TOTAL,F9_09_PC_PENSION_CONT_FUNDRAISE,F9_09_PC_PENSION_CONT_MGMT,F9_09_PC_PENSION_CONT_PROG_SVCE,F9_09_PC_PENSION_CONT_TOTAL,F9_09_PC_TOTAL_FUNC_EXPENSES,F9_09_PC_TOTAL_FUNDRAISE_EXPENSE,F9_09_PC_TOTAL_MGMT_EXPENSE,F9_09_PC_TOTAL_PROG_SVCE_EXPENSE,F9_10_ASSETS_ACC_NET_EOY,F9_10_ASSETS_EXP_PREPAID_EOY,F9_10_ASSETS_INTANGIB_EOY,F9_10_ASSETS_INVENT_SALE_EOY,F9_10_ASSETS_LESS_DEPREC_EOY,F9_10_ASSETS_LOANS_DISQUAL_EOY,F9_10_ASSETS_NOTES_LOANS_NET_EOY,F9_10_ASSETS_OTH_EOY,F9_10_ASSETS_PLEDGES_NET_EOY,F9_10_LIAB_ACC_PAYABLE_EOY,F9_10_LIAB_GRANTS_PAYABLE_EOY,F9_10_LIAB_LOANS_OFF_EOY,F9_10_LIAB_REV_DEFERRED_EOY,F9_10_NAFB_RESTRICT_PERM_EOY,F9_10_NAFB_RESTRICT_TEMP_EOY,F9_10_NAFB_UNRESTRICT_EOY,F9_10_PC_BOND_LIABILITY_EOY,F9_10_PC_CASH_NON_INTEREST_BOY,F9_10_PC_CASH_NON_INTEREST_EOY,F9_10_PC_ESCROW_LIABILITY_EOY,F9_10_PC_INVEST_OTHER_SEC_EOY,F9_10_PC_INVEST_PROG_RELTD_EOY,F9_10_PC_INVEST_PUB_TRADED_EOY,F9_10_PC_LAND_BLDG_EQPMT,F9_10_PC_LAND_BLDG_EQPMT_DEPRCTN,F9_10_PC_LOANS_FROM_OFFICERS_EOY,F9_10_PC_ORG_FOLLOWS_SFAS117,F9_10_PC_ORG_NOT_FOLLOW_SFAS117,F9_10_PC_OTHER_LIABILITIES_EOY,F9_10_PC_RET_EARNINGS_ENDWMT_EOY,F9_10_PC_SAVINGS_TEMP_INVEST_BOY,F9_10_PC_SAVINGS_TEMP_INVEST_EOY,F9_10_PC_SECURED_MORTGAGES_EOY,F9_10_PC_SECURE_MORT_NOTES_EOY,F9_10_PC_UNSECURED_LOANS_EOY,F9_10_PC_UNSECURED_NOTES_BOY,F9_10_PC_UNSECURED_NOTES_EOY,F9_10_PZ_TOTAL_ASSETS_EOY,F9_10_SCHED_O_X,F9_11_PC_RECNCLTN_DONATED_SVCES,F9_11_PC_RECNCLTN_INVSTMNT_EXP,F9_11_PC_RECNCLTN_PRIOR_PER_ADJ,F9_11_PC_RECNCLTN_REV_LESS_EXP,F9_11_PC_RECNCLTN_UNRLZD_GAIN,F9_11_SCHED_O_X,F9_12_PC_ACCNT_COMPILE_OR_REVIEW,F9_12_PC_ACCTG_METHOD_ACCRUAL,F9_12_PC_ACCTG_METHOD_CASH,F9_12_PC_ACCTG_METHOD_OTHER,F9_12_PC_AUDIT_COMMITTEE,F9_12_PC_FED_GRNT_AUDIT_PERFORMD,F9_12_PC_FED_GRNT_A
UDIT_REQUIRED,F9_12_PC_FINCL_STMTS_AUDITED,F9_12_SCHED_O_X,number_of_other_prog_svces,501c3,F9_00_HD_FILER_ADDR_US_L1,F9_00_HD_FILER_ADDR_US_L2,F9_00_HD_FILER_CITY_US,F9_00_HD_FILER_ZIP_US,F9_00_HD_FILER_COUNTRY_FRGN,F9_00_HD_FILER_STATE_US
695015,5d051ab978ffca27b432b3c2,HOUNDHAVEN INC,https://s3.amazonaws.com/irs-form-990/201522119349301137_public.xml,93493211011375,201412,,2016-02-25 16:41:14Z,2014,593655448,,,,"{'AddressLine1': None, 'AddressLine1Txt': 'P O Box 185', 'AddressLine2': None, 'AddressLine2Txt': None, 'City': None, 'CityNm': 'Minneola', 'State': None, 'StateAbbreviationCd': 'FL', 'ZIPCd': '34755', 'ZIPCode': None}",,,{'BusinessNameLine1Txt': 'Houndhaven Inc'},HOUN,,,,0,0,,0,,1,0,,181996,0,0,0,Linda Coletta,2015-07-30,,FL,2014-01-01,2014-12-31,2014,2015-07-30T19:28:29-04:00,0,1,0,,0,www.houndhaven.org,2000,,126430,145090,,3,420,236341,138173,37883,0,,23691,53296,68911,0,289637,138173,0,0,50,0,207084,0,0,5,0,0,541,289637,"Houndhaven, Inc. rescues dogs and puppies from life threatening circumstances, rehabilitates them and then places them in loving homes. We also educate the public about pet overpopulation and the importance of spaying and neutering their pets.",127269,28755,24839,0,,236341,127269,,180565,0,0,,"Houndhaven, Inc. rescues dogs and puppies from euthaniasia at kill shelters and other life threatening circumstances. We give them whatever medical attention they need and care for them until they can be placed in loving homes or with another res...",106501,,24839,,,,,,,,,,,,,,106501,"Houndhaven, Inc. rescues dogs and puppies from life threatening circumstances, rehabilitates them and then places them in loving homes. We also educate the public about pet overpopulation and the importance of spaying and neutering their pets.",0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,1,1,0,3,5,0,0,0,0,0,"""FL""",0,0,0,0,1,,0,0,0,0,0,0,126430,,1108,,323,,23035,,,,7033,,,24839,,126430,118,24839,180565,313,,,,12986,,12986,,,,,5533,,,15800,1467,,,1,,14426,,,,,,,,,,,,,,67080,,,,,,,,,,,,,,,,,,127269,0,20768,106501,,,,,28710,,,,,,,,,,,289637,,194645,260927,,,,,41696,12986,,1,0,,,,,,,,,,289637,0,,,,53296,,0,0,0,1,,0,0,0,0,0,,1,P O Box 185,,Minneola,34755,,FL


# Change Data Types

In [15]:
new_variables_df['python_data_type'].value_counts()

python_data_type
Int64       262
string       24
DateTime      2
Name: count, dtype: int64

### String variables

In [16]:
string_vars = new_variables_df[new_variables_df['python_data_type']=='string']['variable_name_new'].tolist()
print(len(string_vars))
print(string_vars)

24
['F9_00_HD_CTRY_OF_DOMICILE', 'F9_00_HD_FILER_ADDR_US_L1', 'F9_00_HD_FILER_ADDR_US_L2', 'F9_00_HD_FILER_CITY_US', 'F9_00_HD_FILER_COUNTRY_FRGN', 'F9_00_HD_FILER_STATE_US', 'F9_00_HD_FILER_ZIP_US', 'F9_00_HD_GROSS_EXEMPT_NUM', 'F9_00_HD_PRIN_OFF_NAME', 'F9_00_HD_SIGNING_OFFICER_SIGNTR', 'F9_00_HD_SPECIAL_CONDITION_DESC', 'F9_00_HD_STATE_OF_DOMICILE', 'F9_00_HD_TAX_PER_BEGIN', 'F9_00_HD_TAX_PER_END', 'F9_00_HD_TYPE_ORG_OTHER_DESC', 'F9_00_HD_WEBSITE', 'F9_01_PZ_ORGANIZATIONAL_MISSION', 'F9_03_PC_PROG_SVC_ACC_1_DESC', 'F9_03_PC_PROG_SVC_ACC_2_DESC', 'F9_03_PC_PROG_SVC_ACC_3_DESC', 'F9_03_PZ_MISSION_DESCRIPTION', 'F9_06_PC_STATES_WHERE_RET_FILED', 'F9_12_PC_ACCTG_METHOD_OTHER', 'TaxPeriod']


In [17]:
string_vars.remove('TaxPeriod')

In [18]:
#This variable is float64 in the DF (not object), so drop it from the list of string variables
string_vars.remove('F9_00_HD_FILER_ADDR_US_L2')

In [19]:
df[string_vars].dtypes

F9_00_HD_CTRY_OF_DOMICILE          object
F9_00_HD_FILER_ADDR_US_L1          object
F9_00_HD_FILER_CITY_US             object
F9_00_HD_FILER_COUNTRY_FRGN        object
F9_00_HD_FILER_STATE_US            object
F9_00_HD_FILER_ZIP_US              object
F9_00_HD_GROSS_EXEMPT_NUM          object
F9_00_HD_PRIN_OFF_NAME             object
F9_00_HD_SIGNING_OFFICER_SIGNTR    object
F9_00_HD_SPECIAL_CONDITION_DESC    object
F9_00_HD_STATE_OF_DOMICILE         object
F9_00_HD_TAX_PER_BEGIN             object
F9_00_HD_TAX_PER_END               object
F9_00_HD_TYPE_ORG_OTHER_DESC       object
F9_00_HD_WEBSITE                   object
F9_01_PZ_ORGANIZATIONAL_MISSION    object
F9_03_PC_PROG_SVC_ACC_1_DESC       object
F9_03_PC_PROG_SVC_ACC_2_DESC       object
F9_03_PC_PROG_SVC_ACC_3_DESC       object
F9_03_PZ_MISSION_DESCRIPTION       object
F9_06_PC_STATES_WHERE_RET_FILED    object
F9_12_PC_ACCTG_METHOD_OTHER        object
dtype: object

<br>The above variables are already stored as strings (pandas ``object`` dtype), so they can be left alone.
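
An ``object`` dtype can in principle hold any Python object, so if a stricter check were wanted, something like the sketch below could count non-string values per column. The toy frame and the helper name `non_string_count` are hypothetical; the two sample values are taken from the row shown earlier.

```python
import pandas as pd

# Toy object-dtype columns (a couple of values borrowed from the rows above).
toy = pd.DataFrame({
    "F9_00_HD_WEBSITE": ["www.houndhaven.org", None],
    "F9_00_HD_STATE_OF_DOMICILE": ["FL", "PA"],
})

def non_string_count(col):
    """Count non-missing values in a column that are not Python strings."""
    return sum(not isinstance(v, str) for v in col.dropna())

counts = {name: non_string_count(toy[name]) for name in toy.columns}
print(counts)
```

A count of zero for every column would mean the ``object`` columns really do contain only strings and missing values.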

### *DateTime* columns
There are 2 *datetime* columns. We will process each in turn.

In [20]:
DateTime_vars = new_variables_df[new_variables_df['python_data_type']=='DateTime']['variable_name_new'].tolist()
print(len(DateTime_vars))
print(DateTime_vars)

2
['F9_00_HD_BUILD_TIME_STAMP', 'F9_00_HD_TIME_STAMP']


In [21]:
df[DateTime_vars].dtypes

F9_00_HD_BUILD_TIME_STAMP    object
F9_00_HD_TIME_STAMP          object
dtype: object

In [22]:
df[DateTime_vars].sample(5)

Unnamed: 0,F9_00_HD_BUILD_TIME_STAMP,F9_00_HD_TIME_STAMP
1563444,2019-02-21 02:37:17Z,2019-03-16T07:34:22-07:00
2095050,2021-01-29 14:40:06Z,2020-11-17T08:13:43-06:00
2016786,2021-01-29 14:40:06Z,2020-02-21T15:56:22-06:00
717255,2016-03-07 17:11:31Z,2015-09-28T14:08:55-05:00
1753433,2020-03-31 21:24:44Z,2019-11-01T14:46:41-05:00


##### F9_00_HD_BUILD_TIME_STAMP

In [23]:
pd.to_datetime(df.sample(10)['F9_00_HD_BUILD_TIME_STAMP'])

1784255   2020-03-31 21:24:44+00:00
3263460   2024-10-15 13:58:12+00:00
1771433   2020-03-31 21:24:44+00:00
1276081   2017-02-10 21:41:12+00:00
3060024   2023-04-26 12:10:37+00:00
3342595   2024-10-15 13:58:12+00:00
3255894   2023-04-26 12:10:37+00:00
2435905   2022-09-23 18:48:47+00:00
300198    2016-03-07 17:11:31+00:00
1988560   2021-01-29 14:40:06+00:00
Name: F9_00_HD_BUILD_TIME_STAMP, dtype: datetime64[ns, UTC]

In [24]:
print(len(df[df['F9_00_HD_BUILD_TIME_STAMP'].isnull()]))

0


In [25]:
%%time
df['F9_00_HD_BUILD_TIME_STAMP'] = pd.to_datetime(df['F9_00_HD_BUILD_TIME_STAMP'])

CPU times: total: 391 ms
Wall time: 450 ms


In [26]:
df.sample(5)['F9_00_HD_BUILD_TIME_STAMP']

1417650   2018-06-14 16:35:46+00:00
838401    2016-04-25 22:37:26+00:00
2855860   2023-04-26 12:10:37+00:00
1272961   2017-02-10 21:41:12+00:00
295587    2016-02-25 16:41:14+00:00
Name: F9_00_HD_BUILD_TIME_STAMP, dtype: datetime64[ns, UTC]

In [27]:
df['F9_00_HD_BUILD_TIME_STAMP'].min()

Timestamp('2015-11-30 17:44:51+0000', tz='UTC')

In [28]:
df['F9_00_HD_BUILD_TIME_STAMP'].max()

Timestamp('2025-03-06 01:10:19+0000', tz='UTC')

##### F9_00_HD_TIME_STAMP

In [29]:
df.sample(5)['F9_00_HD_TIME_STAMP']

2467157    2021-09-26T16:03:22-07:00
985337     2016-09-09T15:41:35-05:00
2490646    2021-12-11T09:40:05-06:00
279914     2013-02-05T09:48:40-06:00
1438826    2018-05-15T14:03:00-06:00
Name: F9_00_HD_TIME_STAMP, dtype: object

In [30]:
print(len(df[df['F9_00_HD_TIME_STAMP'].isnull()]))

0


In [45]:
%%time
#df['F9_00_HD_TIME_STAMP'] = pd.to_datetime(df['F9_00_HD_TIME_STAMP'])
df['F9_00_HD_TIME_STAMP'] = pd.to_datetime(df['F9_00_HD_TIME_STAMP'], utc=True)

CPU times: total: 8.23 s
Wall time: 8.64 s
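
The raw strings in this column carry several different UTC offsets (-05:00, -06:00, -07:00), which appears to be why `utc=True` is needed to get a single datetime dtype. A minimal sketch on three values copied from the sample above (the names `raw` and `converted` are illustrative, not part of the notebook):

```python
import pandas as pd

# Strings in the same ISO 8601 layout as F9_00_HD_TIME_STAMP, each with a different offset.
raw = pd.Series([
    "2019-03-16T07:34:22-07:00",
    "2020-11-17T08:13:43-06:00",
    "2015-09-28T14:08:55-05:00",
])

# utc=True shifts every value to UTC, so the result is one timezone-aware datetime column.
converted = pd.to_datetime(raw, utc=True)

print(converted.dtype)
print(converted.iloc[0])
```

The instants are unchanged; only the displayed offset differs, so 07:34:22-07:00 is shown as 14:34:22+00:00.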


<br>Function to deal with missing values if needed -- See *IRS 990 e-File Preparer Data -- (9n) -- Schedule O - Number of Words and Time Delta Variables.ipynb*

In [34]:
#def timefunc(x):
#    if pd.isnull(x):
#        return np.nan
#    else: 
#        return pd.to_datetime(x)        

In [279]:
#%%time
#df['F9_00_HD_TIME_STAMP_dt'] = df['F9_00_HD_TIME_STAMP'][:].apply(timefunc)
###df['F9_09_PC_TOTAL_FUNC_EXPENSES'] = df['F9_09_PC_TOTAL_FUNC_EXPENSES'].astype('float')
#df[DateTime_vars][:6]

Unnamed: 0,F9_00_HD_BUILD_TIME_STAMP,F9_00_HD_TIME_STAMP
0,2016-02-24 21:20:13+00:00,2011-11-09 06:41:09-06:00
1,2016-02-24 21:20:13+00:00,2011-11-09 07:32:06-08:00
2,2016-02-24 21:20:13+00:00,2011-11-09 07:33:03-08:00
3,2016-02-24 21:20:13+00:00,2011-11-09 07:54:44-08:00
4,2016-02-24 21:20:13+00:00,2011-11-09 10:05:52-06:00
5,2016-02-24 21:20:13+00:00,2011-11-09 08:42:28-08:00
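
As a possible alternative to a row-wise helper for missing values, `pd.to_datetime` already maps missing entries to `NaT`, and `errors="coerce"` does the same for strings that cannot be parsed. A small sketch on invented values (not the notebook's data; `raw` and `parsed` are hypothetical names):

```python
import pandas as pd

# One valid stamp, one missing value, one unparseable string.
raw = pd.Series(["2011-11-09T06:41:09-06:00", None, "not a timestamp"], dtype=object)

# Missing and unparseable entries both become NaT instead of raising an error.
parsed = pd.to_datetime(raw, utc=True, errors="coerce")

print(parsed)
print(parsed.isna().sum())
```

Since neither timestamp column has missing values here, this would only matter if the same step were reused on a column that does.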


In [35]:
df.sample(5)['F9_00_HD_TIME_STAMP']

2101424    2020-11-16 17:06:25-06:00
1696300    2019-09-23 09:20:52-07:00
2952923    2023-09-25 05:42:39-07:00
2875065    2023-06-16 19:03:45+00:00
2030012    2021-05-13 14:37:33-05:00
Name: F9_00_HD_TIME_STAMP, dtype: object

In [36]:
for index, row in df[:5].iterrows():
    print(type(row['F9_00_HD_BUILD_TIME_STAMP']), type(row['F9_00_HD_TIME_STAMP']))
    print(row['F9_00_HD_BUILD_TIME_STAMP'].year)
    print(row['F9_00_HD_TIME_STAMP'].year)
    #print(row['F9_00_HD_TIME_STAMP_dt'].year)    

<class 'pandas._libs.tslibs.timestamps.Timestamp'> <class 'pandas._libs.tslibs.timestamps.Timestamp'>
2016
2011
<class 'pandas._libs.tslibs.timestamps.Timestamp'> <class 'pandas._libs.tslibs.timestamps.Timestamp'>
2016
2011
<class 'pandas._libs.tslibs.timestamps.Timestamp'> <class 'pandas._libs.tslibs.timestamps.Timestamp'>
2016
2011
<class 'pandas._libs.tslibs.timestamps.Timestamp'> <class 'pandas._libs.tslibs.timestamps.Timestamp'>
2016
2011
<class 'pandas._libs.tslibs.timestamps.Timestamp'> <class 'pandas._libs.tslibs.timestamps.Timestamp'>
2016
2011


In [37]:
df['F9_00_HD_TIME_STAMP'].min()

Timestamp('2002-01-01 03:19:19-0500', tz='UTC-05:00')

In [38]:
df['F9_00_HD_TIME_STAMP'].max()

Timestamp('2025-02-20 23:50:42-0500', tz='UTC-05:00')

In [40]:
%%time
df['F9_00_HD_TIME_STAMP_yr'] = df['F9_00_HD_TIME_STAMP'].apply(lambda x: str(x)[:4])
print(df['F9_00_HD_TIME_STAMP_yr'].value_counts().sort_index(), '\n')
df[['F9_00_HD_TIME_STAMP_yr']][:2]

F9_00_HD_TIME_STAMP_yr
2002         1
2009       183
2010      1333
2011    117559
2012    142315
2013    175554
2014    193702
2015    211775
2016    228991
2017    239015
2018    252071
2019    261282
2020    273462
2021    305355
2022    331136
2023    347755
2024    358034
2025     29485
Name: count, dtype: int64 

CPU times: total: 14.2 s
Wall time: 14.8 s


Unnamed: 0,F9_00_HD_TIME_STAMP_yr
0,2011
1,2011
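
Once the column has a datetime dtype, the year could also be read with the `.dt` accessor instead of slicing the string form. The sketch below uses toy values and hypothetical variable names. One caveat: after conversion with `utc=True` the result is the UTC year, which can differ from the local year for stamps close to midnight on December 31.

```python
import pandas as pd

raw = pd.Series(["2011-12-31T20:00:00-06:00", "2015-09-28T14:08:55-05:00"])
stamps = pd.to_datetime(raw, utc=True)

# Year as written in the original string (local to the filer's offset).
local_year = raw.str[:4]

# Year after normalizing to UTC; the first value rolls over into 2012.
utc_year = stamps.dt.year

print(local_year.tolist())
print(utc_year.value_counts().sort_index())
```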


In [41]:
df[df['F9_00_HD_TIME_STAMP_yr']=='2002'][['EIN', 'fiscal_year', '501c3', 'F9_00_HD_BUILD_TIME_STAMP',
                                          'F9_00_HD_TIME_STAMP', 'F9_00_HD_TIME_STAMP_yr']]

Unnamed: 0,EIN,fiscal_year,501c3,F9_00_HD_BUILD_TIME_STAMP,F9_00_HD_TIME_STAMP,F9_00_HD_TIME_STAMP_yr
234090,10838477,,1,2016-02-24 21:20:13+00:00,2002-01-01 03:19:19-05:00,2002


In [46]:
df[DateTime_vars].dtypes

F9_00_HD_BUILD_TIME_STAMP    datetime64[ns, UTC]
F9_00_HD_TIME_STAMP          datetime64[ns, UTC]
dtype: object

<br>
*NOTE:* One row has a time stamp earlier than 2009: the filing stamped 2002-01-01 shown above.
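
Rows with implausibly early time stamps can be flagged by comparing the converted column against a timezone-aware cutoff. This is a sketch on toy values, and the 2009 cutoff is an assumption taken from the year counts above:

```python
import pandas as pd

stamps = pd.to_datetime(
    pd.Series([
        "2002-01-01T03:19:19-05:00",
        "2011-11-09T06:41:09-06:00",
        "2023-06-16T19:03:45+00:00",
    ]),
    utc=True,
)

# The cutoff must be tz-aware to be comparable with a tz-aware column.
cutoff = pd.Timestamp("2009-01-01", tz="UTC")
too_early = stamps < cutoff

print(too_early.sum())
```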

### *Int64* columns

In [47]:
new_variables_df[new_variables_df['variable_name_new'].isin(['F9_09_EXP_OTH_OTH_TOT', 'F9_09_EXP_OTH_TOT'])]

Unnamed: 0,variable_name_new,original_names,python_data_type
190,F9_09_EXP_OTH_OTH_TOT,"[AllOtherExpensesGrp, AllOtherExpenses]",Int64
191,F9_09_EXP_OTH_TOT,"[OtherExpenses, OtherExpensesGrp]",Int64


In [46]:
#new_variables_df = new_variables_df.drop([186,187])
#new_variables_df = new_variables_df.drop([190,191])

In [48]:
variables_to_drop = ['F9_09_EXP_OTH_OTH_TOT', 'F9_09_EXP_OTH_TOT']
print(len(new_variables_df))
new_variables_df = new_variables_df[~new_variables_df['variable_name_new'].isin(variables_to_drop)]
print(len(new_variables_df))

288
286


In [49]:
Int64_vars = new_variables_df[new_variables_df['python_data_type']=='Int64']['variable_name_new'].tolist()
print(len(Int64_vars))
print(Int64_vars)

260
['F9_00_HD_ADDR_CHANGE', 'F9_00_HD_AMENDED_RETURN', 'F9_00_HD_EXEMPT_STATUS_4847A1', 'F9_00_HD_EXEMPT_STATUS_501C', 'F9_00_HD_EXEMPT_STATUS_501C3', 'F9_00_HD_FINAL_RETURN', 'F9_00_HD_GROSS_RCPT', 'F9_00_HD_GROUP_RETURN', 'F9_00_HD_INCLUDES_SUBORD_ORGS', 'F9_00_HD_INITIAL_RETURN', 'F9_00_HD_TAX_YEAR', 'F9_00_HD_TYPE_ORG_ASSOCIATION', 'F9_00_HD_TYPE_ORG_CORP', 'F9_00_HD_TYPE_ORG_OTHER', 'F9_00_HD_TYPE_ORG_TRUST', 'F9_00_HD_YEAR_FORMED', 'F9_01_PC_BEN_PAID_MEMB_PRIOR', 'F9_01_PC_CONTR_GRANTS_CURR', 'F9_01_PC_CONTR_GRANTS_PRIOR', 'F9_01_PC_GRANTS_PRIOR', 'F9_01_PC_INDEP_VOTING_MEMB', 'F9_01_PC_INVEST_INCOME_PRIOR', 'F9_01_PC_NET_ASSETS_BOY', 'F9_01_PC_OTHER_EXPENSE_PRIOR', 'F9_01_PC_OTHER_REV_PRIOR', 'F9_01_PC_PROF_FUNDRISING_EXP_CURR', 'F9_01_PC_PROF_FUNDRISING_EXP_PRIOR', 'F9_01_PC_PROG_SERVICE_REV_PRIOR', 'F9_01_PC_REV_LESS_EXP_CURR', 'F9_01_PC_REV_LESS_EXP_PRIOR', 'F9_01_PC_TERMINATION_CONTRACTION', 'F9_01_PC_TOT_ASSETS_EOY', 'F9_01_PC_TOT_EXP_PRIOR', 'F9_01_PC_TOT_FNDR_EXP_CURR', 

In [50]:
df[Int64_vars].dtypes[:50]

F9_00_HD_ADDR_CHANGE                   int64
F9_00_HD_AMENDED_RETURN                int64
F9_00_HD_EXEMPT_STATUS_4847A1          int64
F9_00_HD_EXEMPT_STATUS_501C           object
F9_00_HD_EXEMPT_STATUS_501C3           int64
F9_00_HD_FINAL_RETURN                  int64
F9_00_HD_GROSS_RCPT                   object
F9_00_HD_GROUP_RETURN                  int64
F9_00_HD_INCLUDES_SUBORD_ORGS          int64
F9_00_HD_INITIAL_RETURN                int64
F9_00_HD_TAX_YEAR                     object
F9_00_HD_TYPE_ORG_ASSOCIATION          int64
F9_00_HD_TYPE_ORG_CORP                 int64
F9_00_HD_TYPE_ORG_OTHER                int64
F9_00_HD_TYPE_ORG_TRUST                int64
F9_00_HD_YEAR_FORMED                  object
F9_01_PC_BEN_PAID_MEMB_PRIOR          object
F9_01_PC_CONTR_GRANTS_CURR            object
F9_01_PC_CONTR_GRANTS_PRIOR           object
F9_01_PC_GRANTS_PRIOR                 object
F9_01_PC_INDEP_VOTING_MEMB            object
F9_01_PC_INVEST_INCOME_PRIOR          object
F9_01_PC_N

<br>These two variables are ``float64`` instead of ``object``, which means all of their values must be missing: ``['F9_03_PC_PROG_SVC_ACC_2_CODE', 'F9_03_PC_PROG_SVC_ACC_3_CODE']``

In [51]:
df[Int64_vars].dtypes[50:100]

F9_01_PZ_TOT_ASSETS_BOY             object
F9_01_PZ_TOT_EXP_CURR               object
F9_01_PZ_TOT_LIAB_BOY               object
F9_01_PZ_TOT_REV_CURR               object
F9_03_PC_PGMSVC_SIGNIF_CHG           int64
F9_03_PC_PGMSVC_SIGNIF_NEW           int64
F9_03_PC_PROG_SVC_ACC_1_CODE        object
F9_03_PC_PROG_SVC_ACC_1_EXP         object
F9_03_PC_PROG_SVC_ACC_1_GRNT        object
F9_03_PC_PROG_SVC_ACC_1_REV         object
F9_03_PC_PROG_SVC_ACC_2_CODE        object
F9_03_PC_PROG_SVC_ACC_2_EXP         object
F9_03_PC_PROG_SVC_ACC_2_GRNT        object
F9_03_PC_PROG_SVC_ACC_2_REV         object
F9_03_PC_PROG_SVC_ACC_3_CODE        object
F9_03_PC_PROG_SVC_ACC_3_EXP         object
F9_03_PC_PROG_SVC_ACC_3_GRNT        object
F9_03_PC_PROG_SVC_ACC_3_REV         object
F9_03_PC_TOT_OTH_PROG_SVC_EXP       object
F9_03_PC_TOT_OTH_PROG_SVC_GRNT      object
F9_03_PC_TOT_OTH_PROG_SVC_REV       object
F9_03_PC_TOT_PROG_SVC_EXPENSE       object
F9_03_PZ_SCHEDULE_O_PART3            int64
F9_04_PC_AC

In [52]:
df[Int64_vars].dtypes[100:150]

F9_06_PC_ELECTION_BOARD_MEMBERS       int64
F9_06_PC_FAMILY_OR_BUSINESS_REL       int64
F9_06_PC_FORM_AVAIL_OWN_WEBSITE       int64
F9_06_PC_FORM_UPON_REQUEST            int64
F9_06_PC_JOINT_VENTURE_INVESTMNT      int64
F9_06_PC_JOINT_VENTURE_POLICY         int64
F9_06_PC_LOCAL_CHAPTERS               int64
F9_06_PC_MATERIAL_DIVERSION           int64
F9_06_PC_MEMBERS_OR_STOCKHOLDERS      int64
F9_06_PC_MINUTES_COMMITTEES           int64
F9_06_PC_MINUTES_GOVERNING_BODY       int64
F9_06_PC_MONITORING_OF_COI_POLICY     int64
F9_06_PC_NUM_IND_VOTING_MEMBERS      object
F9_06_PC_NUM_VOTING_GOV_MEMBERS      object
F9_06_PC_OFFICER_MAILING_ADDRESS      int64
F9_06_PC_OTHER_COMPENSTN_PROCESS      int64
F9_06_PC_OTHER_WEBSITE                int64
F9_06_PC_OWN_WEBSITE                  int64
F9_06_PC_POLICIES_GOVERN_CHAPTER      int64
F9_06_PC_WHISTLEBLOWER_POLICY         int64
F9_07_EXP_SCHED_O_X                   int64
F9_07_PC_COMPENSATION_OTHER_SRCE      int64
F9_07_PC_FORMER_OFFICER_LISTED  

In [53]:
df[Int64_vars].dtypes[150:200]

F9_09_EXP_AD_PROMO_TOT              object
F9_09_EXP_BENF_PAID_MEMB_TOT        object
F9_09_EXP_CONF_MEETING_TOT          object
F9_09_EXP_DEPREC_FUNDR              object
F9_09_EXP_DEPREC_MAG                object
F9_09_EXP_DEPREC_PROG               object
F9_09_EXP_DEPREC_TOT                object
F9_09_EXP_GRANT_FRGN_TOT            object
F9_09_EXP_GRANT_INDIV_DMSTC_TOT     object
F9_09_EXP_GRANT_ORG_DMSTC_TOT       object
F9_09_EXP_INFO_TECH_TOT             object
F9_09_EXP_INSURANCE_TOT             object
F9_09_EXP_INTEREST_TOT              object
F9_09_EXP_JOINT_COSTS_TOT           object
F9_09_EXP_OCCUPANCY_TOT             object
F9_09_EXP_OFFICE_TOT                object
F9_09_EXP_ROY_TOT                   object
F9_09_EXP_SCHED_O_X                  int64
F9_09_EXP_TRAVEL_ENTRTNMNT_TOT      object
F9_09_EXP_TRAVEL_TOT                object
F9_09_PC_COMP_DISQUAL_FUNDRAISE     object
F9_09_PC_COMP_DISQUAL_MGMT          object
F9_09_PC_COMP_DISQUAL_PROG_SVCE     object
F9_09_PC_CO

In [52]:
df[Int64_vars].dtypes[200:250]

F9_09_PC_PENSION_CONT_PROG_SVCE     object
F9_09_PC_PENSION_CONT_TOTAL         object
F9_09_PC_TOTAL_FUNC_EXPENSES        object
F9_09_PC_TOTAL_FUNDRAISE_EXPENSE    object
F9_09_PC_TOTAL_MGMT_EXPENSE         object
F9_09_PC_TOTAL_PROG_SVCE_EXPENSE    object
F9_10_ASSETS_ACC_NET_EOY            object
F9_10_ASSETS_EXP_PREPAID_EOY        object
F9_10_ASSETS_INTANGIB_EOY           object
F9_10_ASSETS_INVENT_SALE_EOY        object
F9_10_ASSETS_LESS_DEPREC_EOY        object
F9_10_ASSETS_LOANS_DISQUAL_EOY      object
F9_10_ASSETS_NOTES_LOANS_NET_EOY    object
F9_10_ASSETS_OTH_EOY                object
F9_10_ASSETS_PLEDGES_NET_EOY        object
F9_10_LIAB_ACC_PAYABLE_EOY          object
F9_10_LIAB_GRANTS_PAYABLE_EOY       object
F9_10_LIAB_LOANS_OFF_EOY            object
F9_10_LIAB_REV_DEFERRED_EOY         object
F9_10_NAFB_RESTRICT_PERM_EOY        object
F9_10_NAFB_RESTRICT_TEMP_EOY        object
F9_10_NAFB_UNRESTRICT_EOY           object
F9_10_PC_BOND_LIABILITY_EOY         object
F9_10_PC_CA

In [54]:
df[Int64_vars].dtypes[250:]

F9_11_SCHED_O_X                      int64
F9_12_PC_ACCNT_COMPILE_OR_REVIEW     int64
F9_12_PC_ACCTG_METHOD_ACCRUAL        int64
F9_12_PC_ACCTG_METHOD_CASH           int64
F9_12_PC_AUDIT_COMMITTEE             int64
F9_12_PC_FED_GRNT_AUDIT_PERFORMD     int64
F9_12_PC_FED_GRNT_AUDIT_REQUIRED     int64
F9_12_PC_FINCL_STMTS_AUDITED         int64
F9_12_SCHED_O_X                      int64
number_of_other_prog_svces          object
dtype: object

In [58]:
# Step 1: Get the variable names expected to be Int64
#Int64_vars = new_variables_df[new_variables_df['python_data_type']=='Int64']['variable_name_new'].tolist()

# Step 2: Filter to keep only those that are currently of dtype 'object' in df
#not_yet_converted = [var for var in Int64_vars if df[var].dtype == 'object']
#print(len(not_yet_converted))

# Optional: turn it into a DataFrame like the original Int64_vars
#not_yet_converted_df = new_variables_df[new_variables_df['variable_name_new'].isin(not_yet_converted)]

Int64_vars = [var for var in Int64_vars if df[var].dtype == 'object']
print(len(Int64_vars))

182


In [59]:
df[Int64_vars].dtypes[:20]

F9_00_HD_EXEMPT_STATUS_501C           object
F9_00_HD_GROSS_RCPT                   object
F9_00_HD_TAX_YEAR                     object
F9_00_HD_YEAR_FORMED                  object
F9_01_PC_BEN_PAID_MEMB_PRIOR          object
F9_01_PC_CONTR_GRANTS_CURR            object
F9_01_PC_CONTR_GRANTS_PRIOR           object
F9_01_PC_GRANTS_PRIOR                 object
F9_01_PC_INDEP_VOTING_MEMB            object
F9_01_PC_INVEST_INCOME_PRIOR          object
F9_01_PC_NET_ASSETS_BOY               object
F9_01_PC_OTHER_EXPENSE_PRIOR          object
F9_01_PC_OTHER_REV_PRIOR              object
F9_01_PC_PROF_FUNDRISING_EXP_CURR     object
F9_01_PC_PROF_FUNDRISING_EXP_PRIOR    object
F9_01_PC_PROG_SERVICE_REV_PRIOR       object
F9_01_PC_REV_LESS_EXP_CURR            object
F9_01_PC_REV_LESS_EXP_PRIOR           object
F9_01_PC_TOT_ASSETS_EOY               object
F9_01_PC_TOT_EXP_PRIOR                object
dtype: object

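##### This approach doesn't work:
- df['F9_00_HD_EXEMPT_STATUS_501C'] = df['F9_00_HD_EXEMPT_STATUS_501C'].astype('Int64')

Casting directly to 'Int64' fails for some variables with the following error message:
- TypeError: object cannot be converted to an IntegerDtype

Instead, use the one-liner *df[Int64_vars] = df[Int64_vars].apply(pd.to_numeric)*, which lets pandas choose the numeric type for each column: typically ``int64`` for columns with no missing values and ``float64`` for columns that contain missing values (not the nullable 'Int64').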
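
To illustrate how `pd.to_numeric` settles on a dtype per column, here is a minimal sketch on a toy frame (the column names `complete` and `has_missing` are made up). The final cast to 'Int64' is an optional follow-up shown as an assumption, not a step taken at this point in the notebook.

```python
import pandas as pd

toy = pd.DataFrame({
    "complete": pd.Series(["2018", "2013", "2016"], dtype=object),
    "has_missing": pd.Series(["190309", None, "4675222"], dtype=object),
})

# Column-wise conversion: no missing values -> int64, missing values -> float64.
converted = toy.apply(pd.to_numeric)
print(converted.dtypes)

# Optional: a float column holding only whole numbers and NaN can then be cast
# to the nullable integer dtype, where NaN becomes <NA>.
nullable = converted["has_missing"].astype("Int64")
print(nullable)
```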

In [60]:
len(Int64_vars)

182

<br>I am guessing the *pd.to_numeric* issue is due to *number_of_other_prog_svces*, which holds non-numeric values such as:

{'Desc': 'THOMAS BRIDGE WATER CORPORATION PROVIDES WATER SERVICE TO APPROXIMATELY 1700 WATER CUSTOMERS WHICH WOULD NOT HAVE A CLEAN AND ADEQUATE WATER SUPPLY IF NOT FOR THE COMPANY.'}
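
One way to confirm a guess like this is to scan each column for values that would not parse as numbers. The helper `parses_as_number` and the toy frame below are hypothetical, written only to illustrate the check:

```python
import math
import pandas as pd

def parses_as_number(value):
    """Return True for missing values and for anything float() accepts."""
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return True
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        # dicts raise TypeError, non-numeric strings raise ValueError
        return False

toy = pd.DataFrame({
    "amount": pd.Series(["495541", None, "120954"], dtype=object),
    "number_of_other_prog_svces": pd.Series(
        [None, {"Desc": "WATER SERVICE"}, "2"], dtype=object
    ),
})

# Columns with at least one value that cannot be read as a number.
problem_columns = [
    col for col in toy.columns if not toy[col].map(parses_as_number).all()
]
print(problem_columns)
```

On the full data this is a Python-level loop over every cell, so it may be slow; running it on a sample first is probably enough to locate the offending columns.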


In [61]:
df[Int64_vars].sample(5)

Unnamed: 0,F9_00_HD_EXEMPT_STATUS_501C,F9_00_HD_GROSS_RCPT,F9_00_HD_TAX_YEAR,F9_00_HD_YEAR_FORMED,F9_01_PC_BEN_PAID_MEMB_PRIOR,F9_01_PC_CONTR_GRANTS_CURR,F9_01_PC_CONTR_GRANTS_PRIOR,F9_01_PC_GRANTS_PRIOR,F9_01_PC_INDEP_VOTING_MEMB,F9_01_PC_INVEST_INCOME_PRIOR,F9_01_PC_NET_ASSETS_BOY,F9_01_PC_OTHER_EXPENSE_PRIOR,F9_01_PC_OTHER_REV_PRIOR,F9_01_PC_PROF_FUNDRISING_EXP_CURR,F9_01_PC_PROF_FUNDRISING_EXP_PRIOR,F9_01_PC_PROG_SERVICE_REV_PRIOR,F9_01_PC_REV_LESS_EXP_CURR,F9_01_PC_REV_LESS_EXP_PRIOR,F9_01_PC_TOT_ASSETS_EOY,F9_01_PC_TOT_EXP_PRIOR,F9_01_PC_TOT_FNDR_EXP_CURR,F9_01_PC_TOT_INDIV_EMPLOYED,F9_01_PC_TOT_INDIV_VOLUNTEERS,F9_01_PC_TOT_LIABILITIES_EOY,F9_01_PC_TOT_REVENUE_PRIOR,F9_01_PC_TOT_UBI_GROSS,F9_01_PC_TOT_UBI_NET,F9_01_PC_VOTING_MEMB_GOV_BODY,F9_01_PZ_BEN_PAID_TO_MEMB_CURR,F9_01_PZ_GRANTS_PAID_CURR,F9_01_PZ_INVEST_INCOME_CURR,F9_01_PZ_NAFB_EOY,F9_01_PZ_OTHER_EXPENSE_CURR,F9_01_PZ_OTHER_REV_CURR,F9_01_PZ_PROG_SERVICE_REV_CURR,F9_01_PZ_SALARIES_CURR,F9_01_PZ_SALARIES_PRIOR,F9_01_PZ_TOT_ASSETS_BOY,F9_01_PZ_TOT_EXP_CURR,F9_01_PZ_TOT_LIAB_BOY,F9_01_PZ_TOT_REV_CURR,F9_03_PC_PROG_SVC_ACC_1_CODE,F9_03_PC_PROG_SVC_ACC_1_EXP,F9_03_PC_PROG_SVC_ACC_1_GRNT,F9_03_PC_PROG_SVC_ACC_1_REV,F9_03_PC_PROG_SVC_ACC_2_CODE,F9_03_PC_PROG_SVC_ACC_2_EXP,F9_03_PC_PROG_SVC_ACC_2_GRNT,F9_03_PC_PROG_SVC_ACC_2_REV,F9_03_PC_PROG_SVC_ACC_3_CODE,F9_03_PC_PROG_SVC_ACC_3_EXP,F9_03_PC_PROG_SVC_ACC_3_GRNT,F9_03_PC_PROG_SVC_ACC_3_REV,F9_03_PC_TOT_OTH_PROG_SVC_EXP,F9_03_PC_TOT_OTH_PROG_SVC_GRNT,F9_03_PC_TOT_OTH_PROG_SVC_REV,F9_03_PC_TOT_PROG_SVC_EXPENSE,F9_05_PC_NUMBER_EMPLOYEES_W3,F9_05_PC_NUMBER_FORMS_1096,F9_06_PC_NUM_IND_VOTING_MEMBERS,F9_06_PC_NUM_VOTING_GOV_MEMBERS,F9_07_PC_NUM_CONTRCTRS_GRTR_100K,F9_07_PC_NUM_INDS_GREATER_100K,F9_07_PC_TOT_OTHER_COMPENSATION,F9_07_PC_TOT_REPRT_COMP_FROM_ORG,F9_07_PC_TOT_REPRT_COMP_RLTD_ORG,F9_08_PC_ALL_OTHER_CONTRIBUTIONS,F9_08_PC_CONTS_REPRTD_FNDRAISNG,F9_08_PC_COST_OF_GOODS_SOLD,F9_08_PC_FEDERATED_CAMPAIGNS,F9_08_PC_FUNDRAISING_DIRECT_EXP,F9_08_PC_FUNDRAISING_E
VENTS,F9_08_PC_FUNDRAISING_GROSS_INC,F9_08_PC_GAMING_DIRECT_EXPENSES,F9_08_PC_GAMING_GROSS_INCOME,F9_08_PC_GOVERNMENT_GRANTS,F9_08_PC_GROSS_SALES_INVENTORY,F9_08_PC_MEMBERSHIP_DUES,F9_08_PC_NONCASH_CONTRIBUTIONS,F9_08_PC_PROGRAM_SVCE_REV_TOTAL,F9_08_PC_RELATED_ORGANIZATIONS,F9_08_PC_TOTAL_CONTRIBUTIONS,F9_08_PC_TOTAL_OTHER_REVENUE,F9_08_PC_TOTAL_PROG_SVCE_REVENUE,F9_08_PC_TOTAL_REVENUE,F9_09_EXP_AD_PROMO_TOT,F9_09_EXP_BENF_PAID_MEMB_TOT,F9_09_EXP_CONF_MEETING_TOT,F9_09_EXP_DEPREC_FUNDR,F9_09_EXP_DEPREC_MAG,F9_09_EXP_DEPREC_PROG,F9_09_EXP_DEPREC_TOT,F9_09_EXP_GRANT_FRGN_TOT,F9_09_EXP_GRANT_INDIV_DMSTC_TOT,F9_09_EXP_GRANT_ORG_DMSTC_TOT,F9_09_EXP_INFO_TECH_TOT,F9_09_EXP_INSURANCE_TOT,F9_09_EXP_INTEREST_TOT,F9_09_EXP_JOINT_COSTS_TOT,F9_09_EXP_OCCUPANCY_TOT,F9_09_EXP_OFFICE_TOT,F9_09_EXP_ROY_TOT,F9_09_EXP_TRAVEL_ENTRTNMNT_TOT,F9_09_EXP_TRAVEL_TOT,F9_09_PC_COMP_DISQUAL_FUNDRAISE,F9_09_PC_COMP_DISQUAL_MGMT,F9_09_PC_COMP_DISQUAL_PROG_SVCE,F9_09_PC_COMP_DISQUAL_TOTAL,F9_09_PC_COMP_OFFICERS_FUNDRAISE,F9_09_PC_COMP_OFFICERS_MGMT,F9_09_PC_COMP_OFFICERS_PROG_SVCE,F9_09_PC_COMP_OFFICERS_TOTAL,F9_09_PC_FEES_FOR_SVCE_ACCT_TOT,F9_09_PC_FEES_FOR_SVCE_FR_TOT,F9_09_PC_FEES_FOR_SVCE_INVST_TOT,F9_09_PC_FEES_FOR_SVCE_LEGL_TOT,F9_09_PC_FEES_FOR_SVCE_LOBB_TOT,F9_09_PC_FEES_FOR_SVCE_MGMT_TOT,F9_09_PC_FEES_FOR_SVCE_OTH_TOT,F9_09_PC_OTHER_EMP_BEN_FUNDRAISE,F9_09_PC_OTHER_EMP_BEN_MGMT,F9_09_PC_OTHER_EMP_BEN_PROG_SVCE,F9_09_PC_OTHER_EMP_BEN_TOTAL,F9_09_PC_OTHER_SALARY_FUNDRAISE,F9_09_PC_OTHER_SALARY_MGMT,F9_09_PC_OTHER_SALARY_PROG_SVCE,F9_09_PC_OTHER_SALARY_TOTAL,F9_09_PC_PAYMENT_TO_AFFILIATES,F9_09_PC_PAYROLL_TAX_FUNDRAISE,F9_09_PC_PAYROLL_TAX_MGMT,F9_09_PC_PAYROLL_TAX_PROG_SVCE,F9_09_PC_PAYROLL_TAX_TOTAL,F9_09_PC_PENSION_CONT_FUNDRAISE,F9_09_PC_PENSION_CONT_MGMT,F9_09_PC_PENSION_CONT_PROG_SVCE,F9_09_PC_PENSION_CONT_TOTAL,F9_09_PC_TOTAL_FUNC_EXPENSES,F9_09_PC_TOTAL_FUNDRAISE_EXPENSE,F9_09_PC_TOTAL_MGMT_EXPENSE,F9_09_PC_TOTAL_PROG_SVCE_EXPENSE,F9_10_ASSETS_ACC_NET_EOY,F9_10_ASSETS_EXP_PREPAID_EO
Y,F9_10_ASSETS_INTANGIB_EOY,F9_10_ASSETS_INVENT_SALE_EOY,F9_10_ASSETS_LESS_DEPREC_EOY,F9_10_ASSETS_LOANS_DISQUAL_EOY,F9_10_ASSETS_NOTES_LOANS_NET_EOY,F9_10_ASSETS_OTH_EOY,F9_10_ASSETS_PLEDGES_NET_EOY,F9_10_LIAB_ACC_PAYABLE_EOY,F9_10_LIAB_GRANTS_PAYABLE_EOY,F9_10_LIAB_LOANS_OFF_EOY,F9_10_LIAB_REV_DEFERRED_EOY,F9_10_NAFB_RESTRICT_PERM_EOY,F9_10_NAFB_RESTRICT_TEMP_EOY,F9_10_NAFB_UNRESTRICT_EOY,F9_10_PC_BOND_LIABILITY_EOY,F9_10_PC_CASH_NON_INTEREST_BOY,F9_10_PC_CASH_NON_INTEREST_EOY,F9_10_PC_ESCROW_LIABILITY_EOY,F9_10_PC_INVEST_OTHER_SEC_EOY,F9_10_PC_INVEST_PROG_RELTD_EOY,F9_10_PC_INVEST_PUB_TRADED_EOY,F9_10_PC_LAND_BLDG_EQPMT,F9_10_PC_LAND_BLDG_EQPMT_DEPRCTN,F9_10_PC_LOANS_FROM_OFFICERS_EOY,F9_10_PC_OTHER_LIABILITIES_EOY,F9_10_PC_RET_EARNINGS_ENDWMT_EOY,F9_10_PC_SAVINGS_TEMP_INVEST_BOY,F9_10_PC_SAVINGS_TEMP_INVEST_EOY,F9_10_PC_SECURED_MORTGAGES_EOY,F9_10_PC_SECURE_MORT_NOTES_EOY,F9_10_PC_UNSECURED_LOANS_EOY,F9_10_PC_UNSECURED_NOTES_BOY,F9_10_PC_UNSECURED_NOTES_EOY,F9_10_PZ_TOTAL_ASSETS_EOY,F9_11_PC_RECNCLTN_DONATED_SVCES,F9_11_PC_RECNCLTN_INVSTMNT_EXP,F9_11_PC_RECNCLTN_PRIOR_PER_ADJ,F9_11_PC_RECNCLTN_REV_LESS_EXP,F9_11_PC_RECNCLTN_UNRLZD_GAIN,number_of_other_prog_svces
1804886,,495541,2018,1991,,176181,190309.0,,9,9.0,397514.0,182059.0,54724.0,0,,308122.0,-47148,28197.0,642765,524967.0,39024,53,35.0,292399,553164.0,0,0.0,9,0,0,14,350366,179211,23285,284884,352301,342908.0,665788.0,531512,268274.0,484364,,271195,0.0,472790,,,,,,,,,,,,271195,53,27,9,9,,,0.0,60350.0,0.0,97113.0,,,,11177.0,,22737.0,,,29548.0,,49520.0,,284884,,176181.0,11725.0,284884,484364,2464.0,,,0.0,21339.0,0.0,21339.0,,,,4817.0,8479.0,7692.0,,23122,,,,,,,,,2114.0,19612.0,38624.0,60350.0,22406.0,,,,,,,0.0,16318.0,0.0,16318.0,8842.0,80478.0,158557.0,247877.0,,940.0,7894.0,18922.0,27756.0,,,,,531512,39024,221293,271195,,15732.0,,,535425.0,,,1000.0,,29028.0,,,120575.0,,71456.0,278910.0,,,,,,,,920882.0,385457.0,,,,73462.0,90608.0,142796.0,142796.0,,,,642765,,,,-47148,,
579218,,16559559,2013,1939,0.0,10819355,4675222.0,89280.0,32,11429.0,26238672.0,6051052.0,1623447.0,6612,21586.0,3164374.0,5626894,-1035410.0,32771171,10509882.0,546412,164,250.0,904878,9474472.0,803914,-244072.0,32,0,91927,5977,31866293,5975807,1540157,3393037,4057286,4347964.0,27665115.0,10131632,1426443.0,15758526,713990.0,6734965,91927.0,3301027,,999804.0,,66447.0,,419538.0,,,,,,8154307,164,17,32,32,9.0,4.0,42305.0,711482.0,0.0,9141511.0,559683.0,656829.0,,136871.0,559683.0,135921.0,,,,2168370.0,1118161.0,299898.0,3393037,,10819355.0,29566.0,3393037,15758526,830148.0,0.0,0.0,368.0,16101.0,1811598.0,1828067.0,0.0,91927.0,0.0,117563.0,139937.0,0.0,,1120598,229696.0,0.0,0.0,0.0,,,,0.0,,442955.0,313797.0,756752.0,74920.0,6612.0,0.0,1110.0,0.0,0.0,0.0,11615.0,57695.0,185277.0,254587.0,334155.0,461470.0,1973778.0,2769403.0,0.0,26573.0,67480.0,182491.0,276544.0,,,,0.0,10131632,546412,1430913,8154307,92874.0,48875.0,0.0,246993.0,21359611.0,0.0,0.0,1075430.0,5431059.0,904878.0,0.0,0.0,0.0,5446366.0,3589075.0,22830852.0,0.0,2310351.0,4069963.0,0.0,0.0,0.0,356666.0,44424865.0,23065254.0,0.0,0.0,,160735.0,89700.0,0.0,0.0,0.0,0.0,0.0,32771171,,,,5626894,727.0,
1147209,,120954,2016,2016,,0,,,4,,,,,0,,,16374,,30674,,0,0,,14661,,0,,4,0,0,0,16013,104580,0,120954,0,,,104580,,120954,,104580,,120594,,,,,,,,,,,,104580,0,0,4,4,,,,,,,,,,,,,,,,,,,120954,,,,120954,120954,,,,,,,,,,,,,,,101622,1238.0,,,,,,,,,,,,,,,1720.0,,,,,,,,,,,,,,,,,,,,,104580,0,1238,103342,,,,,,,,,,1229.0,,,13432.0,,,,,,30674.0,,,,,,,,,16013.0,,,,,,,,30674,,,,16374,,
2781862,,22545,2022,2022,,8001,,,4,,,,,0,,,-6103,,-6103,,0,0,12.0,0,,0,,4,0,0,0,-6103,28648,0,14544,0,,,28648,,22545,,28648,,22545,,,,,,,,,,,,28648,0,0,4,4,,,,,,8001.0,,,,,,,,,,,,,14544,,8001.0,,14544,22545,348.0,0.0,6231.0,,,,0.0,0.0,0.0,0.0,0.0,0.0,0.0,,0,767.0,0.0,0.0,1646.0,,,,0.0,,,,0.0,0.0,,0.0,0.0,0.0,0.0,0.0,,,,0.0,,,,0.0,0.0,,,,0.0,,,,0.0,28648,0,0,28648,,,,,,,,,,,,,,,,,,,-6103.0,,,,,,,,,,,,,,,,,-6103,,,,-6103,,
3331381,,24701877,2023,1998,0.0,5821979,6953707.0,0.0,15,24848.0,20798241.0,7433271.0,181935.0,0,0.0,18039194.0,-262647,686131.0,23964503,24513553.0,0,457,15.0,3032245,25199684.0,0,0.0,15,0,0,295839,20932258,8903232,212891,17995518,15685642,17080282.0,24465413.0,24588874,3667172.0,24326227,,11782533,,9854198,,5492953.0,,6130719.0,,2357287.0,,,1540159.0,,2154709.0,21172932,457,80,15,15,3.0,14.0,185032.0,1585168.0,0.0,325000.0,,,,,,,,,5496979.0,,,,17995518,,5821979.0,178355.0,17995518,24326227,,,35503.0,,89962.0,597853.0,687815.0,,,,,264537.0,17237.0,,897912,307412.0,,,188019.0,,,,,,360659.0,281567.0,642226.0,,,,7976.0,,,1690072.0,,141399.0,883315.0,1024714.0,,1306059.0,11026285.0,12332344.0,,,203418.0,1161468.0,1364886.0,,35975.0,285497.0,321472.0,24588874,0,3415942,21172932,2016586.0,284776.0,,,10087229.0,,,495910.0,440649.0,1947500.0,,,156559.0,,,,85097.0,5634359.0,3913875.0,,0.0,,2915171.0,17589092.0,7501863.0,,675041.0,,3079481.0,3810307.0,168048.0,168048.0,,,,23964503,,,,-262647,396664.0,"{""ExpenseAmt"": ""1540159"", ""RevenueAmt"": ""2154709"", ""Desc"": ""CONSUMER DIRECTED PERSONAL CARE SERVICES - PROVIDES SERVICES TO ALLOW MEDICAID ELIGIBLE INDIVIDUALS IN THE COMMUNITY TO TRAIN, HIRE, AND DISCHARGE THEIR OWN PERSONAL CARE ATTENDANTS. 40 ..."


In [62]:
print(len(Int64_vars))
Int64_vars.remove('number_of_other_prog_svces')
print(len(Int64_vars))

182
181


In [57]:
#%%time
#print(len(df), len(df.columns))
#df[Int64_vars] = df[Int64_vars].apply(pd.to_numeric)
#print(len(df), len(df.columns))
#df[:1]

891980 298
891980 298
CPU times: total: 10min 53s
Wall time: 11min 28s


Unnamed: 0,URL,F9_09_PC_FEES_FOR_SVCE_FR_TOT,F9_00_HD_BUILD_TIME_STAMP,fiscal_year,EIN,BusinessName,BusinessNameControlTxt,PhoneNum,USAddress,InCareOfNm,ForeignAddress,ForeignPhoneNum,F9_00_HD_ADDR_CHANGE,F9_00_HD_AMENDED_RETURN,F9_00_HD_CTRY_OF_DOMICILE,F9_00_HD_EXEMPT_STATUS_4847A1,F9_00_HD_EXEMPT_STATUS_501C,F9_00_HD_EXEMPT_STATUS_501C3,F9_00_HD_FINAL_RETURN,F9_00_HD_GROSS_EXEMPT_NUM,F9_00_HD_GROSS_RCPT,F9_00_HD_GROUP_RETURN,F9_00_HD_INCLUDES_SUBORD_ORGS,F9_00_HD_INITIAL_RETURN,F9_00_HD_PRIN_OFF_NAME,F9_00_HD_SIGNING_OFFICER_SIGNTR,F9_00_HD_SPECIAL_CONDITION_DESC,F9_00_HD_STATE_OF_DOMICILE,F9_00_HD_TAX_PER_BEGIN,F9_00_HD_TAX_PER_END,F9_00_HD_TAX_YEAR,F9_00_HD_TIME_STAMP,F9_00_HD_TYPE_ORG_ASSOCIATION,F9_00_HD_TYPE_ORG_CORP,F9_00_HD_TYPE_ORG_OTHER,F9_00_HD_TYPE_ORG_OTHER_DESC,F9_00_HD_TYPE_ORG_TRUST,F9_00_HD_WEBSITE,F9_00_HD_YEAR_FORMED,F9_01_PC_BEN_PAID_MEMB_PRIOR,F9_01_PC_CONTR_GRANTS_CURR,F9_01_PC_CONTR_GRANTS_PRIOR,F9_01_PC_GRANTS_PRIOR,F9_01_PC_INDEP_VOTING_MEMB,F9_01_PC_INVEST_INCOME_PRIOR,F9_01_PC_NET_ASSETS_BOY,F9_01_PC_OTHER_EXPENSE_PRIOR,F9_01_PC_OTHER_REV_PRIOR,F9_01_PC_PROF_FUNDRISING_EXP_CURR,F9_01_PC_PROF_FUNDRISING_EXP_PRIOR,F9_01_PC_PROG_SERVICE_REV_PRIOR,F9_01_PC_REV_LESS_EXP_CURR,F9_01_PC_REV_LESS_EXP_PRIOR,F9_01_PC_TERMINATION_CONTRACTION,F9_01_PC_TOT_ASSETS_EOY,F9_01_PC_TOT_EXP_PRIOR,F9_01_PC_TOT_FNDR_EXP_CURR,F9_01_PC_TOT_INDIV_EMPLOYED,F9_01_PC_TOT_INDIV_VOLUNTEERS,F9_01_PC_TOT_LIABILITIES_EOY,F9_01_PC_TOT_REVENUE_PRIOR,F9_01_PC_TOT_UBI_GROSS,F9_01_PC_TOT_UBI_NET,F9_01_PC_VOTING_MEMB_GOV_BODY,F9_01_PZ_BEN_PAID_TO_MEMB_CURR,F9_01_PZ_GRANTS_PAID_CURR,F9_01_PZ_INVEST_INCOME_CURR,F9_01_PZ_NAFB_EOY,F9_01_PZ_ORGANIZATIONAL_MISSION,F9_01_PZ_OTHER_EXPENSE_CURR,F9_01_PZ_OTHER_REV_CURR,F9_01_PZ_PROG_SERVICE_REV_CURR,F9_01_PZ_SALARIES_CURR,F9_01_PZ_SALARIES_PRIOR,F9_01_PZ_TOT_ASSETS_BOY,F9_01_PZ_TOT_EXP_CURR,F9_01_PZ_TOT_LIAB_BOY,F9_01_PZ_TOT_REV_CURR,F9_03_PC_PGMSVC_SIGNIF_CHG,F9_03_PC_PGMSVC_SIGNIF_NEW,F9_03_PC_PROG_SVC_ACC_1_CODE,F9_03_PC_PROG_SVC_ACC
_1_DESC,F9_03_PC_PROG_SVC_ACC_1_EXP,F9_03_PC_PROG_SVC_ACC_1_GRNT,F9_03_PC_PROG_SVC_ACC_1_REV,F9_03_PC_PROG_SVC_ACC_2_CODE,F9_03_PC_PROG_SVC_ACC_2_DESC,F9_03_PC_PROG_SVC_ACC_2_EXP,F9_03_PC_PROG_SVC_ACC_2_GRNT,F9_03_PC_PROG_SVC_ACC_2_REV,F9_03_PC_PROG_SVC_ACC_3_CODE,F9_03_PC_PROG_SVC_ACC_3_DESC,F9_03_PC_PROG_SVC_ACC_3_EXP,F9_03_PC_PROG_SVC_ACC_3_GRNT,F9_03_PC_PROG_SVC_ACC_3_REV,F9_03_PC_TOT_OTH_PROG_SVC_EXP,F9_03_PC_TOT_OTH_PROG_SVC_GRNT,F9_03_PC_TOT_OTH_PROG_SVC_REV,F9_03_PC_TOT_PROG_SVC_EXPENSE,F9_03_PZ_MISSION_DESCRIPTION,F9_03_PZ_SCHEDULE_O_PART3,F9_04_PC_ACTVITIES_VIA_PARTNER,F9_04_PC_CONTROLLED_ENTITY,F9_04_PC_DISREGARDED_ENTITY,F9_04_PC_EXCESS_BENEFIT_TRANS,F9_04_PC_FR_EVENT_INC_GT_15K,F9_04_PC_GAMING_INC_GT_15K,F9_04_PC_LOBBYING_ACTIVITIES,F9_04_PC_POLITICAL_ACTIVITIES,F9_04_PC_PRIOR_EXCESS_BEN_TRAN,F9_04_PC_PROF_FR_EXP_GT_15K,F9_04_PC_RELATED_ENTITY,F9_04_PC_TRANS_TO_CNTRLD_ENT,F9_04_PC_TRANS_WITH_CNTRLD_ENT,F9_05_EXP_SCHED_O_X,F9_05_PC_NUMBER_EMPLOYEES_W3,F9_05_PC_NUMBER_FORMS_1096,F9_05_PC_UNRELATED_BUS_INCOME,F9_06_EXP_SCHED_O_X,F9_06_PC_990_PROVIDED_GOV_BODY,F9_06_PC_ANNUAL_DISC_COVRD_PERS,F9_06_PC_CEO_COMPENSTN_PROCESS,F9_06_PC_CHANGES_ORGANIZING_DOCS,F9_06_PC_CONFLICT_OF_INTEREST,F9_06_PC_DECISIONS_SUBJ_APPROVAL,F9_06_PC_DELEGATION_MGT_DUTIES,F9_06_PC_DELEGATION_OF_MGT,F9_06_PC_DOCUMENT_RET_POLICY,F9_06_PC_ELECTION_BOARD_MEMBERS,F9_06_PC_FAMILY_OR_BUSINESS_REL,F9_06_PC_FORM_AVAIL_OWN_WEBSITE,F9_06_PC_FORM_UPON_REQUEST,F9_06_PC_JOINT_VENTURE_INVESTMNT,F9_06_PC_JOINT_VENTURE_POLICY,F9_06_PC_LOCAL_CHAPTERS,F9_06_PC_MATERIAL_DIVERSION,F9_06_PC_MEMBERS_OR_STOCKHOLDERS,F9_06_PC_MINUTES_COMMITTEES,F9_06_PC_MINUTES_GOVERNING_BODY,F9_06_PC_MONITORING_OF_COI_POLICY,F9_06_PC_NUM_IND_VOTING_MEMBERS,F9_06_PC_NUM_VOTING_GOV_MEMBERS,F9_06_PC_OFFICER_MAILING_ADDRESS,F9_06_PC_OTHER_COMPENSTN_PROCESS,F9_06_PC_OTHER_WEBSITE,F9_06_PC_OWN_WEBSITE,F9_06_PC_POLICIES_GOVERN_CHAPTER,F9_06_PC_STATES_WHERE_RET_FILED,F9_06_PC_WHISTLEBLOWER_POLICY,F9_07_EXP_SCHED_O_X,F9_07_PC_COMPE
NSATION_OTHER_SRCE,F9_07_PC_FORMER_OFFICER_LISTED,F9_07_PC_NO_LISTED_PERS_COMPENSD,F9_07_PC_NUM_CONTRCTRS_GRTR_100K,F9_07_PC_NUM_INDS_GREATER_100K,F9_07_PC_TOTAL_COMP_GRTR_150K,F9_07_PC_TOT_OTHER_COMPENSATION,F9_07_PC_TOT_REPRT_COMP_FROM_ORG,F9_07_PC_TOT_REPRT_COMP_RLTD_ORG,F9_08_EXP_SCHED_O_X,F9_08_PC_ALL_OTHER_CONTRIBUTIONS,F9_08_PC_CONTS_REPRTD_FNDRAISNG,F9_08_PC_COST_OF_GOODS_SOLD,F9_08_PC_FEDERATED_CAMPAIGNS,F9_08_PC_FUNDRAISING_DIRECT_EXP,F9_08_PC_FUNDRAISING_EVENTS,F9_08_PC_FUNDRAISING_GROSS_INC,F9_08_PC_GAMING_DIRECT_EXPENSES,F9_08_PC_GAMING_GROSS_INCOME,F9_08_PC_GOVERNMENT_GRANTS,F9_08_PC_GROSS_SALES_INVENTORY,F9_08_PC_MEMBERSHIP_DUES,F9_08_PC_NONCASH_CONTRIBUTIONS,F9_08_PC_PROGRAM_SVCE_REV_TOTAL,F9_08_PC_RELATED_ORGANIZATIONS,F9_08_PC_TOTAL_CONTRIBUTIONS,F9_08_PC_TOTAL_OTHER_REVENUE,F9_08_PC_TOTAL_PROG_SVCE_REVENUE,F9_08_PC_TOTAL_REVENUE,F9_09_EXP_AD_PROMO_TOT,F9_09_EXP_BENF_PAID_MEMB_TOT,F9_09_EXP_CONF_MEETING_TOT,F9_09_EXP_DEPREC_FUNDR,F9_09_EXP_DEPREC_MAG,F9_09_EXP_DEPREC_PROG,F9_09_EXP_DEPREC_TOT,F9_09_EXP_GRANT_FRGN_TOT,F9_09_EXP_GRANT_INDIV_DMSTC_TOT,F9_09_EXP_GRANT_ORG_DMSTC_TOT,F9_09_EXP_INFO_TECH_TOT,F9_09_EXP_INSURANCE_TOT,F9_09_EXP_INTEREST_TOT,F9_09_EXP_JOINT_COSTS_TOT,F9_09_EXP_OCCUPANCY_TOT,F9_09_EXP_OFFICE_TOT,F9_09_EXP_OTH_OTH_TOT,F9_09_EXP_ROY_TOT,F9_09_EXP_SCHED_O_X,F9_09_EXP_TRAVEL_ENTRTNMNT_TOT,F9_09_EXP_TRAVEL_TOT,F9_09_PC_COMP_DISQUAL_FUNDRAISE,F9_09_PC_COMP_DISQUAL_MGMT,F9_09_PC_COMP_DISQUAL_PROG_SVCE,F9_09_PC_COMP_DISQUAL_TOTAL,F9_09_PC_COMP_OFFICERS_FUNDRAISE,F9_09_PC_COMP_OFFICERS_MGMT,F9_09_PC_COMP_OFFICERS_PROG_SVCE,F9_09_PC_COMP_OFFICERS_TOTAL,F9_09_PC_FEES_FOR_SVCE_ACCT_TOT,F9_09_PC_FEES_FOR_SVCE_INVST_TOT,F9_09_PC_FEES_FOR_SVCE_LEGL_TOT,F9_09_PC_FEES_FOR_SVCE_LOBB_TOT,F9_09_PC_FEES_FOR_SVCE_MGMT_TOT,F9_09_PC_FEES_FOR_SVCE_OTH_TOT,F9_09_PC_OTHER_EMP_BEN_FUNDRAISE,F9_09_PC_OTHER_EMP_BEN_MGMT,F9_09_PC_OTHER_EMP_BEN_PROG_SVCE,F9_09_PC_OTHER_EMP_BEN_TOTAL,F9_09_PC_OTHER_SALARY_FUNDRAISE,F9_09_PC_OTHER_SALARY_MGMT,F9_09_PC_OTHER_SA
LARY_PROG_SVCE,F9_09_PC_OTHER_SALARY_TOTAL,F9_09_PC_PAYMENT_TO_AFFILIATES,F9_09_PC_PAYROLL_TAX_FUNDRAISE,F9_09_PC_PAYROLL_TAX_MGMT,F9_09_PC_PAYROLL_TAX_PROG_SVCE,F9_09_PC_PAYROLL_TAX_TOTAL,F9_09_PC_PENSION_CONT_FUNDRAISE,F9_09_PC_PENSION_CONT_MGMT,F9_09_PC_PENSION_CONT_PROG_SVCE,F9_09_PC_PENSION_CONT_TOTAL,F9_09_PC_TOTAL_FUNC_EXPENSES,F9_09_PC_TOTAL_FUNDRAISE_EXPENSE,F9_09_PC_TOTAL_MGMT_EXPENSE,F9_09_PC_TOTAL_PROG_SVCE_EXPENSE,F9_10_ASSETS_ACC_NET_EOY,F9_10_ASSETS_EXP_PREPAID_EOY,F9_10_ASSETS_INTANGIB_EOY,F9_10_ASSETS_INVENT_SALE_EOY,F9_10_ASSETS_LESS_DEPREC_EOY,F9_10_ASSETS_LOANS_DISQUAL_EOY,F9_10_ASSETS_NOTES_LOANS_NET_EOY,F9_10_ASSETS_OTH_EOY,F9_10_ASSETS_PLEDGES_NET_EOY,F9_10_LIAB_ACC_PAYABLE_EOY,F9_10_LIAB_GRANTS_PAYABLE_EOY,F9_10_LIAB_LOANS_OFF_EOY,F9_10_LIAB_REV_DEFERRED_EOY,F9_10_NAFB_RESTRICT_PERM_EOY,F9_10_NAFB_RESTRICT_TEMP_EOY,F9_10_NAFB_UNRESTRICT_EOY,F9_10_PC_BOND_LIABILITY_EOY,F9_10_PC_CASH_NON_INTEREST_BOY,F9_10_PC_CASH_NON_INTEREST_EOY,F9_10_PC_ESCROW_LIABILITY_EOY,F9_10_PC_INVEST_OTHER_SEC_EOY,F9_10_PC_INVEST_PROG_RELTD_EOY,F9_10_PC_INVEST_PUB_TRADED_EOY,F9_10_PC_LAND_BLDG_EQPMT,F9_10_PC_LAND_BLDG_EQPMT_DEPRCTN,F9_10_PC_LOANS_FROM_OFFICERS_EOY,F9_10_PC_ORG_FOLLOWS_SFAS117,F9_10_PC_ORG_NOT_FOLLOW_SFAS117,F9_10_PC_OTHER_LIABILITIES_EOY,F9_10_PC_RET_EARNINGS_ENDWMT_EOY,F9_10_PC_SAVINGS_TEMP_INVEST_BOY,F9_10_PC_SAVINGS_TEMP_INVEST_EOY,F9_10_PC_SECURED_MORTGAGES_EOY,F9_10_PC_SECURE_MORT_NOTES_EOY,F9_10_PC_UNSECURED_LOANS_EOY,F9_10_PC_UNSECURED_NOTES_BOY,F9_10_PC_UNSECURED_NOTES_EOY,F9_10_PZ_TOTAL_ASSETS_EOY,F9_10_SCHED_O_X,F9_11_PC_RECNCLTN_DONATED_SVCES,F9_11_PC_RECNCLTN_INVSTMNT_EXP,F9_11_PC_RECNCLTN_PRIOR_PER_ADJ,F9_11_PC_RECNCLTN_REV_LESS_EXP,F9_11_PC_RECNCLTN_UNRLZD_GAIN,F9_11_SCHED_O_X,F9_12_PC_ACCNT_COMPILE_OR_REVIEW,F9_12_PC_ACCTG_METHOD_ACCRUAL,F9_12_PC_ACCTG_METHOD_CASH,F9_12_PC_ACCTG_METHOD_OTHER,F9_12_PC_AUDIT_COMMITTEE,F9_12_PC_FED_GRNT_AUDIT_PERFORMD,F9_12_PC_FED_GRNT_AUDIT_REQUIRED,F9_12_PC_FINCL_STMTS_AUDITED,F9_12_SCHED_O_X,number_of_ot
her_prog_svces,501c3,F9_00_HD_FILER_ADDR_US_L1,F9_00_HD_FILER_ADDR_US_L2,F9_00_HD_FILER_CITY_US,F9_00_HD_FILER_ZIP_US,F9_00_HD_FILER_COUNTRY_FRGN,F9_00_HD_FILER_STATE_US,F9_00_HD_TIME_STAMP_yr
0,https://s3.amazonaws.com/irs-form-990/201812509349300101_public.xml,,2022-09-23 18:48:47+00:00,2018,346526754,{'BusinessNameLine1Txt': 'Lucas County Farm Bureau'},LUCA,4198338015,"{'AddressLine1Txt': '109 Portage St', 'CityNm': 'Woodville', 'StateAbbreviationCd': 'OH', 'ZIPCd': '43469'}",,,,,,,,5.0,,,,272756,0,,,KAYLA RICHARDS,2018-08-29,,OH,2017-08-01,2018-07-31,2017,2018-09-07 04:44:38-07:00,,1.0,,,,,1916.0,239263.0,236036,278582.0,,10,15945.0,383254.0,94080.0,29098.0,0,,,3331,-30456.0,,424855,354081.0,0,7,39.0,38270,323625.0,0,,10,181041,0,10719,386585,IMPROVE RURAL STANDARD OF LIVING.,65279,26001,0,23105,20738.0,424800.0,269425,41546.0,272756,0.0,0.0,,BENEFITS PAID TO OR FOR MEMBERS - THIS IS PAID MEMBERSHIPS TO OHIO FARM BUREAU AND TO AMERICAN FARM BUREAU TO FUTHER THEIR EFFORTS IN PROGRAMMING AND PROMOTING THE FARMING COMMUNITY.,,,,,MEMBERSHIP - COSTS OF PROMOTING FARM BUREAU AND ITS MISSION. PROMOTION OF FARM BUREAU PROGRAMS AND EVENTS IN ORDER TO EDUCATE THE FARMER AND CONSUMER IN CURRENT FARMING AND FOOD ISSUES.,,,,,"CONFERENCE, CONVENTIONS AND MEETINGS - EDUCATION OF VOLUNTEERS FOR THE PROMOTING AND MARKETING OF FARM ISSUES AND CURRENT EVENTS.",,,,,,,,Improve rural standard of living.,,0,0,0,0.0,0,0,0.0,0,0.0,0,0,,0.0,,7,0,0,1.0,1,1.0,1.0,0,1,0,0,0,1,0,1,,1.0,0,,0,0,1,1.0,1,1.0,10,10,0,1.0,,,,OH,1,,0,0,,,,0,,2130.0,,,,,,,,,,,,,,236036.0,,,,236036.0,26001.0,,272756,,181041.0,7018.0,,6770.0,,6770.0,,,,,697.0,,,3478.0,2283.0,42350,,,,331.0,,,,,,,1860.0,1860.0,2352.0,,,,,,,,,,,,21245.0,21245.0,,,,,,,,,,269425,,17563.0,251862.0,6278.0,733.0,,1197.0,109784.0,,,,,11076.0,,,27194.0,,,,,74474.0,72904.0,,,,233959.0,165116.0,55332.0,,,1.0,,386585.0,,,,,,,,424855,,,,,3331,,,0,1.0,,,,,0.0,0,,,0,109 Portage St,,Woodville,43469,,OH,2018


#### Updated way
This version converts every column in `Int64_vars` from 'object' to pandas' nullable 'Int64' type in a single step.

Why this works:
- `pd.to_numeric(..., errors='coerce')` converts strings to numbers and sets entries that cannot be parsed to `NaN`.
- `.astype('Int64')` converts to pandas' nullable integer type (not the NumPy `int64`, which can't store missing values). The cast raises an error rather than truncating if a column holds non-integer values such as 1.5.
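
A minimal sketch of the two steps on a toy column may make the behaviour easier to see. The Series `raw` and its values are made up for illustration and are not taken from the e-file data; only `pd.to_numeric` and `.astype('Int64')` come from the cell below.

```python
import pandas as pd

# Toy column in the same shape as the raw data: numbers stored as strings.
raw = pd.Series(["12", "-3", None, "n/a"], dtype="object")

# Unparseable entries ("n/a") and missing values become NaN; the result is float64.
as_float = pd.to_numeric(raw, errors="coerce")

# The nullable extension dtype stores the integers and shows the gaps as <NA>.
as_int = as_float.astype("Int64")
print(as_int.dtype, int(as_int.isna().sum()))

# A fractional value is rejected by the cast rather than silently truncated.
try:
    pd.to_numeric(pd.Series(["1.5"]), errors="coerce").astype("Int64")
    fractional_rejected = False
except (TypeError, ValueError):
    fractional_rejected = True
```

Because the intermediate result is `float64` whenever a column has missing values, integers beyond roughly 2**53 could lose precision on the way through; that is unlikely to matter for dollar amounts of the size shown here, but it is worth keeping in mind.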

In [None]:
#%%time
#print(len(df), len(df.columns))
#df[Int64_vars] = df[Int64_vars].apply(lambda col: pd.to_numeric(col, errors='coerce').astype('Int64'))
#print(len(df), len(df.columns))
#df[:1]

This version logs the number of missing values before and after conversion for each column, so you can see how many values were coerced to `NaN` because they could not be parsed:

In [63]:
%%time
print(len(df), len(df.columns))

for col in Int64_vars:
    before_na = df[col].isna().sum()
    df[col] = pd.to_numeric(df[col], errors='coerce').astype('Int64')
    after_na = df[col].isna().sum()
    newly_na = after_na - before_na
    print(f"{col}: {newly_na} newly missing (coerced)")

print(len(df), len(df.columns))
df[:1]

3469008 306
F9_00_HD_EXEMPT_STATUS_501C: 0 newly missing (coerced)
F9_00_HD_GROSS_RCPT: 0 newly missing (coerced)
F9_00_HD_TAX_YEAR: 0 newly missing (coerced)
F9_00_HD_YEAR_FORMED: 0 newly missing (coerced)
F9_01_PC_BEN_PAID_MEMB_PRIOR: 0 newly missing (coerced)
F9_01_PC_CONTR_GRANTS_CURR: 0 newly missing (coerced)
F9_01_PC_CONTR_GRANTS_PRIOR: 0 newly missing (coerced)
F9_01_PC_GRANTS_PRIOR: 0 newly missing (coerced)
F9_01_PC_INDEP_VOTING_MEMB: 0 newly missing (coerced)
F9_01_PC_INVEST_INCOME_PRIOR: 0 newly missing (coerced)
F9_01_PC_NET_ASSETS_BOY: 0 newly missing (coerced)
F9_01_PC_OTHER_EXPENSE_PRIOR: 0 newly missing (coerced)
F9_01_PC_OTHER_REV_PRIOR: 0 newly missing (coerced)
F9_01_PC_PROF_FUNDRISING_EXP_CURR: 0 newly missing (coerced)
F9_01_PC_PROF_FUNDRISING_EXP_PRIOR: 0 newly missing (coerced)
F9_01_PC_PROG_SERVICE_REV_PRIOR: 0 newly missing (coerced)
F9_01_PC_REV_LESS_EXP_CURR: 0 newly missing (coerced)
F9_01_PC_REV_LESS_EXP_PRIOR: 0 newly missing (coerced)
F9_01_PC_TOT_ASSETS

F9_10_LIAB_ACC_PAYABLE_EOY: 0 newly missing (coerced)
F9_10_LIAB_GRANTS_PAYABLE_EOY: 0 newly missing (coerced)
F9_10_LIAB_LOANS_OFF_EOY: 0 newly missing (coerced)
F9_10_LIAB_REV_DEFERRED_EOY: 0 newly missing (coerced)
F9_10_NAFB_RESTRICT_PERM_EOY: 0 newly missing (coerced)
F9_10_NAFB_RESTRICT_TEMP_EOY: 0 newly missing (coerced)
F9_10_NAFB_UNRESTRICT_EOY: 0 newly missing (coerced)
F9_10_PC_BOND_LIABILITY_EOY: 0 newly missing (coerced)
F9_10_PC_CASH_NON_INTEREST_BOY: 0 newly missing (coerced)
F9_10_PC_CASH_NON_INTEREST_EOY: 0 newly missing (coerced)
F9_10_PC_ESCROW_LIABILITY_EOY: 0 newly missing (coerced)
F9_10_PC_INVEST_OTHER_SEC_EOY: 0 newly missing (coerced)
F9_10_PC_INVEST_PROG_RELTD_EOY: 0 newly missing (coerced)
F9_10_PC_INVEST_PUB_TRADED_EOY: 0 newly missing (coerced)
F9_10_PC_LAND_BLDG_EQPMT: 0 newly missing (coerced)
F9_10_PC_LAND_BLDG_EQPMT_DEPRCTN: 0 newly missing (coerced)
F9_10_PC_LOANS_FROM_OFFICERS_EOY: 0 newly missing (coerced)
F9_10_PC_OTHER_LIABILITIES_EOY: 0 newly miss

Unnamed: 0,_id,OrganizationName,URL,DLN,TaxPeriod,F9_09_PC_FEES_FOR_SVCE_FR_TOT,F9_00_HD_BUILD_TIME_STAMP,fiscal_year,EIN,Name,NameControl,Phone,USAddress,ForeignAddress,InCareOfName,BusinessName,BusinessNameControlTxt,PhoneNum,InCareOfNm,ForeignPhoneNum,F9_00_HD_ADDR_CHANGE,F9_00_HD_AMENDED_RETURN,F9_00_HD_CTRY_OF_DOMICILE,F9_00_HD_EXEMPT_STATUS_4847A1,F9_00_HD_EXEMPT_STATUS_501C,F9_00_HD_EXEMPT_STATUS_501C3,F9_00_HD_FINAL_RETURN,F9_00_HD_GROSS_EXEMPT_NUM,F9_00_HD_GROSS_RCPT,F9_00_HD_GROUP_RETURN,F9_00_HD_INCLUDES_SUBORD_ORGS,F9_00_HD_INITIAL_RETURN,F9_00_HD_PRIN_OFF_NAME,F9_00_HD_SIGNING_OFFICER_SIGNTR,F9_00_HD_SPECIAL_CONDITION_DESC,F9_00_HD_STATE_OF_DOMICILE,F9_00_HD_TAX_PER_BEGIN,F9_00_HD_TAX_PER_END,F9_00_HD_TAX_YEAR,F9_00_HD_TIME_STAMP,F9_00_HD_TYPE_ORG_ASSOCIATION,F9_00_HD_TYPE_ORG_CORP,F9_00_HD_TYPE_ORG_OTHER,F9_00_HD_TYPE_ORG_OTHER_DESC,F9_00_HD_TYPE_ORG_TRUST,F9_00_HD_WEBSITE,F9_00_HD_YEAR_FORMED,F9_01_PC_BEN_PAID_MEMB_PRIOR,F9_01_PC_CONTR_GRANTS_CURR,F9_01_PC_CONTR_GRANTS_PRIOR,F9_01_PC_GRANTS_PRIOR,F9_01_PC_INDEP_VOTING_MEMB,F9_01_PC_INVEST_INCOME_PRIOR,F9_01_PC_NET_ASSETS_BOY,F9_01_PC_OTHER_EXPENSE_PRIOR,F9_01_PC_OTHER_REV_PRIOR,F9_01_PC_PROF_FUNDRISING_EXP_CURR,F9_01_PC_PROF_FUNDRISING_EXP_PRIOR,F9_01_PC_PROG_SERVICE_REV_PRIOR,F9_01_PC_REV_LESS_EXP_CURR,F9_01_PC_REV_LESS_EXP_PRIOR,F9_01_PC_TERMINATION_CONTRACTION,F9_01_PC_TOT_ASSETS_EOY,F9_01_PC_TOT_EXP_PRIOR,F9_01_PC_TOT_FNDR_EXP_CURR,F9_01_PC_TOT_INDIV_EMPLOYED,F9_01_PC_TOT_INDIV_VOLUNTEERS,F9_01_PC_TOT_LIABILITIES_EOY,F9_01_PC_TOT_REVENUE_PRIOR,F9_01_PC_TOT_UBI_GROSS,F9_01_PC_TOT_UBI_NET,F9_01_PC_VOTING_MEMB_GOV_BODY,F9_01_PZ_BEN_PAID_TO_MEMB_CURR,F9_01_PZ_GRANTS_PAID_CURR,F9_01_PZ_INVEST_INCOME_CURR,F9_01_PZ_NAFB_EOY,F9_01_PZ_ORGANIZATIONAL_MISSION,F9_01_PZ_OTHER_EXPENSE_CURR,F9_01_PZ_OTHER_REV_CURR,F9_01_PZ_PROG_SERVICE_REV_CURR,F9_01_PZ_SALARIES_CURR,F9_01_PZ_SALARIES_PRIOR,F9_01_PZ_TOT_ASSETS_BOY,F9_01_PZ_TOT_EXP_CURR,F9_01_PZ_TOT_LIAB_BOY,F9_01_PZ_TOT_REV_CURR,F9_03_PC_PGMSVC_SIGNIF_CHG,F9_03_
PC_PGMSVC_SIGNIF_NEW,F9_03_PC_PROG_SVC_ACC_1_CODE,F9_03_PC_PROG_SVC_ACC_1_DESC,F9_03_PC_PROG_SVC_ACC_1_EXP,F9_03_PC_PROG_SVC_ACC_1_GRNT,F9_03_PC_PROG_SVC_ACC_1_REV,F9_03_PC_PROG_SVC_ACC_2_CODE,F9_03_PC_PROG_SVC_ACC_2_DESC,F9_03_PC_PROG_SVC_ACC_2_EXP,F9_03_PC_PROG_SVC_ACC_2_GRNT,F9_03_PC_PROG_SVC_ACC_2_REV,F9_03_PC_PROG_SVC_ACC_3_CODE,F9_03_PC_PROG_SVC_ACC_3_DESC,F9_03_PC_PROG_SVC_ACC_3_EXP,F9_03_PC_PROG_SVC_ACC_3_GRNT,F9_03_PC_PROG_SVC_ACC_3_REV,F9_03_PC_TOT_OTH_PROG_SVC_EXP,F9_03_PC_TOT_OTH_PROG_SVC_GRNT,F9_03_PC_TOT_OTH_PROG_SVC_REV,F9_03_PC_TOT_PROG_SVC_EXPENSE,F9_03_PZ_MISSION_DESCRIPTION,F9_03_PZ_SCHEDULE_O_PART3,F9_04_PC_ACTVITIES_VIA_PARTNER,F9_04_PC_CONTROLLED_ENTITY,F9_04_PC_DISREGARDED_ENTITY,F9_04_PC_EXCESS_BENEFIT_TRANS,F9_04_PC_FR_EVENT_INC_GT_15K,F9_04_PC_GAMING_INC_GT_15K,F9_04_PC_LOBBYING_ACTIVITIES,F9_04_PC_POLITICAL_ACTIVITIES,F9_04_PC_PRIOR_EXCESS_BEN_TRAN,F9_04_PC_PROF_FR_EXP_GT_15K,F9_04_PC_RELATED_ENTITY,F9_04_PC_TRANS_TO_CNTRLD_ENT,F9_04_PC_TRANS_WITH_CNTRLD_ENT,F9_05_EXP_SCHED_O_X,F9_05_PC_NUMBER_EMPLOYEES_W3,F9_05_PC_NUMBER_FORMS_1096,F9_05_PC_UNRELATED_BUS_INCOME,F9_06_EXP_SCHED_O_X,F9_06_PC_990_PROVIDED_GOV_BODY,F9_06_PC_ANNUAL_DISC_COVRD_PERS,F9_06_PC_CEO_COMPENSTN_PROCESS,F9_06_PC_CHANGES_ORGANIZING_DOCS,F9_06_PC_CONFLICT_OF_INTEREST,F9_06_PC_DECISIONS_SUBJ_APPROVAL,F9_06_PC_DELEGATION_MGT_DUTIES,F9_06_PC_DELEGATION_OF_MGT,F9_06_PC_DOCUMENT_RET_POLICY,F9_06_PC_ELECTION_BOARD_MEMBERS,F9_06_PC_FAMILY_OR_BUSINESS_REL,F9_06_PC_FORM_AVAIL_OWN_WEBSITE,F9_06_PC_FORM_UPON_REQUEST,F9_06_PC_JOINT_VENTURE_INVESTMNT,F9_06_PC_JOINT_VENTURE_POLICY,F9_06_PC_LOCAL_CHAPTERS,F9_06_PC_MATERIAL_DIVERSION,F9_06_PC_MEMBERS_OR_STOCKHOLDERS,F9_06_PC_MINUTES_COMMITTEES,F9_06_PC_MINUTES_GOVERNING_BODY,F9_06_PC_MONITORING_OF_COI_POLICY,F9_06_PC_NUM_IND_VOTING_MEMBERS,F9_06_PC_NUM_VOTING_GOV_MEMBERS,F9_06_PC_OFFICER_MAILING_ADDRESS,F9_06_PC_OTHER_COMPENSTN_PROCESS,F9_06_PC_OTHER_WEBSITE,F9_06_PC_OWN_WEBSITE,F9_06_PC_POLICIES_GOVERN_CHAPTER,F9_06_PC_STATES_WHERE_RET
_FILED,F9_06_PC_WHISTLEBLOWER_POLICY,F9_07_EXP_SCHED_O_X,F9_07_PC_COMPENSATION_OTHER_SRCE,F9_07_PC_FORMER_OFFICER_LISTED,F9_07_PC_NO_LISTED_PERS_COMPENSD,F9_07_PC_NUM_CONTRCTRS_GRTR_100K,F9_07_PC_NUM_INDS_GREATER_100K,F9_07_PC_TOTAL_COMP_GRTR_150K,F9_07_PC_TOT_OTHER_COMPENSATION,F9_07_PC_TOT_REPRT_COMP_FROM_ORG,F9_07_PC_TOT_REPRT_COMP_RLTD_ORG,F9_08_EXP_SCHED_O_X,F9_08_PC_ALL_OTHER_CONTRIBUTIONS,F9_08_PC_CONTS_REPRTD_FNDRAISNG,F9_08_PC_COST_OF_GOODS_SOLD,F9_08_PC_FEDERATED_CAMPAIGNS,F9_08_PC_FUNDRAISING_DIRECT_EXP,F9_08_PC_FUNDRAISING_EVENTS,F9_08_PC_FUNDRAISING_GROSS_INC,F9_08_PC_GAMING_DIRECT_EXPENSES,F9_08_PC_GAMING_GROSS_INCOME,F9_08_PC_GOVERNMENT_GRANTS,F9_08_PC_GROSS_SALES_INVENTORY,F9_08_PC_MEMBERSHIP_DUES,F9_08_PC_NONCASH_CONTRIBUTIONS,F9_08_PC_PROGRAM_SVCE_REV_TOTAL,F9_08_PC_RELATED_ORGANIZATIONS,F9_08_PC_TOTAL_CONTRIBUTIONS,F9_08_PC_TOTAL_OTHER_REVENUE,F9_08_PC_TOTAL_PROG_SVCE_REVENUE,F9_08_PC_TOTAL_REVENUE,F9_09_EXP_AD_PROMO_TOT,F9_09_EXP_BENF_PAID_MEMB_TOT,F9_09_EXP_CONF_MEETING_TOT,F9_09_EXP_DEPREC_FUNDR,F9_09_EXP_DEPREC_MAG,F9_09_EXP_DEPREC_PROG,F9_09_EXP_DEPREC_TOT,F9_09_EXP_GRANT_FRGN_TOT,F9_09_EXP_GRANT_INDIV_DMSTC_TOT,F9_09_EXP_GRANT_ORG_DMSTC_TOT,F9_09_EXP_INFO_TECH_TOT,F9_09_EXP_INSURANCE_TOT,F9_09_EXP_INTEREST_TOT,F9_09_EXP_JOINT_COSTS_TOT,F9_09_EXP_OCCUPANCY_TOT,F9_09_EXP_OFFICE_TOT,F9_09_EXP_OTH_OTH_TOT,F9_09_EXP_ROY_TOT,F9_09_EXP_SCHED_O_X,F9_09_EXP_TRAVEL_ENTRTNMNT_TOT,F9_09_EXP_TRAVEL_TOT,F9_09_PC_COMP_DISQUAL_FUNDRAISE,F9_09_PC_COMP_DISQUAL_MGMT,F9_09_PC_COMP_DISQUAL_PROG_SVCE,F9_09_PC_COMP_DISQUAL_TOTAL,F9_09_PC_COMP_OFFICERS_FUNDRAISE,F9_09_PC_COMP_OFFICERS_MGMT,F9_09_PC_COMP_OFFICERS_PROG_SVCE,F9_09_PC_COMP_OFFICERS_TOTAL,F9_09_PC_FEES_FOR_SVCE_ACCT_TOT,F9_09_PC_FEES_FOR_SVCE_INVST_TOT,F9_09_PC_FEES_FOR_SVCE_LEGL_TOT,F9_09_PC_FEES_FOR_SVCE_LOBB_TOT,F9_09_PC_FEES_FOR_SVCE_MGMT_TOT,F9_09_PC_FEES_FOR_SVCE_OTH_TOT,F9_09_PC_OTHER_EMP_BEN_FUNDRAISE,F9_09_PC_OTHER_EMP_BEN_MGMT,F9_09_PC_OTHER_EMP_BEN_PROG_SVCE,F9_09_PC_OTHER_EMP_BEN_TOTAL,F9_09
_PC_OTHER_SALARY_FUNDRAISE,F9_09_PC_OTHER_SALARY_MGMT,F9_09_PC_OTHER_SALARY_PROG_SVCE,F9_09_PC_OTHER_SALARY_TOTAL,F9_09_PC_PAYMENT_TO_AFFILIATES,F9_09_PC_PAYROLL_TAX_FUNDRAISE,F9_09_PC_PAYROLL_TAX_MGMT,F9_09_PC_PAYROLL_TAX_PROG_SVCE,F9_09_PC_PAYROLL_TAX_TOTAL,F9_09_PC_PENSION_CONT_FUNDRAISE,F9_09_PC_PENSION_CONT_MGMT,F9_09_PC_PENSION_CONT_PROG_SVCE,F9_09_PC_PENSION_CONT_TOTAL,F9_09_PC_TOTAL_FUNC_EXPENSES,F9_09_PC_TOTAL_FUNDRAISE_EXPENSE,F9_09_PC_TOTAL_MGMT_EXPENSE,F9_09_PC_TOTAL_PROG_SVCE_EXPENSE,F9_10_ASSETS_ACC_NET_EOY,F9_10_ASSETS_EXP_PREPAID_EOY,F9_10_ASSETS_INTANGIB_EOY,F9_10_ASSETS_INVENT_SALE_EOY,F9_10_ASSETS_LESS_DEPREC_EOY,F9_10_ASSETS_LOANS_DISQUAL_EOY,F9_10_ASSETS_NOTES_LOANS_NET_EOY,F9_10_ASSETS_OTH_EOY,F9_10_ASSETS_PLEDGES_NET_EOY,F9_10_LIAB_ACC_PAYABLE_EOY,F9_10_LIAB_GRANTS_PAYABLE_EOY,F9_10_LIAB_LOANS_OFF_EOY,F9_10_LIAB_REV_DEFERRED_EOY,F9_10_NAFB_RESTRICT_PERM_EOY,F9_10_NAFB_RESTRICT_TEMP_EOY,F9_10_NAFB_UNRESTRICT_EOY,F9_10_PC_BOND_LIABILITY_EOY,F9_10_PC_CASH_NON_INTEREST_BOY,F9_10_PC_CASH_NON_INTEREST_EOY,F9_10_PC_ESCROW_LIABILITY_EOY,F9_10_PC_INVEST_OTHER_SEC_EOY,F9_10_PC_INVEST_PROG_RELTD_EOY,F9_10_PC_INVEST_PUB_TRADED_EOY,F9_10_PC_LAND_BLDG_EQPMT,F9_10_PC_LAND_BLDG_EQPMT_DEPRCTN,F9_10_PC_LOANS_FROM_OFFICERS_EOY,F9_10_PC_ORG_FOLLOWS_SFAS117,F9_10_PC_ORG_NOT_FOLLOW_SFAS117,F9_10_PC_OTHER_LIABILITIES_EOY,F9_10_PC_RET_EARNINGS_ENDWMT_EOY,F9_10_PC_SAVINGS_TEMP_INVEST_BOY,F9_10_PC_SAVINGS_TEMP_INVEST_EOY,F9_10_PC_SECURED_MORTGAGES_EOY,F9_10_PC_SECURE_MORT_NOTES_EOY,F9_10_PC_UNSECURED_LOANS_EOY,F9_10_PC_UNSECURED_NOTES_BOY,F9_10_PC_UNSECURED_NOTES_EOY,F9_10_PZ_TOTAL_ASSETS_EOY,F9_10_SCHED_O_X,F9_11_PC_RECNCLTN_DONATED_SVCES,F9_11_PC_RECNCLTN_INVSTMNT_EXP,F9_11_PC_RECNCLTN_PRIOR_PER_ADJ,F9_11_PC_RECNCLTN_REV_LESS_EXP,F9_11_PC_RECNCLTN_UNRLZD_GAIN,F9_11_SCHED_O_X,F9_12_PC_ACCNT_COMPILE_OR_REVIEW,F9_12_PC_ACCTG_METHOD_ACCRUAL,F9_12_PC_ACCTG_METHOD_CASH,F9_12_PC_ACCTG_METHOD_OTHER,F9_12_PC_AUDIT_COMMITTEE,F9_12_PC_FED_GRNT_AUDIT_PERFORMD,F9_12_PC_FED_GRNT_A
UDIT_REQUIRED,F9_12_PC_FINCL_STMTS_AUDITED,F9_12_SCHED_O_X,number_of_other_prog_svces,501c3,F9_00_HD_FILER_ADDR_US_L1,F9_00_HD_FILER_ADDR_US_L2,F9_00_HD_FILER_CITY_US,F9_00_HD_FILER_ZIP_US,F9_00_HD_FILER_COUNTRY_FRGN,F9_00_HD_FILER_STATE_US,F9_00_HD_TIME_STAMP_yr
0,5d019e6778ffca27b42818d7,RONALD MCDONALD HOUSE CHARITIES- PHILADELPHIA REGION INC,https://s3.amazonaws.com/irs-form-990/201113139349301301_public.xml,93493313013011,201012,,2016-02-24 21:20:13+00:00,,232705170,"{'BusinessNameLine1': 'RONALD MCDONALD HOUSE CHARITIES-', 'BusinessNameLine2': 'PHILADELPHIA REGION INC'}",RONA,8565826843,"{'AddressLine1': '1525 VALLEY CENTER PARKWAY NO 300', 'AddressLine1Txt': None, 'AddressLine2': None, 'AddressLine2Txt': None, 'City': 'BETHLEHEM', 'CityNm': None, 'State': 'PA', 'StateAbbreviationCd': None, 'ZIPCd': None, 'ZIPCode': '18017'}",,,,,,,,1,0,,0,,1,0,,1473903,0,0,0,MICHAEL ANTON,2011-11-04,,PA,2010-01-01,2010-12-31,2010,2011-11-09 12:41:09+00:00,0,1,0,,0,,1992,0,1439340,1044925,638637,10,30447,1753405,243131,0,0,0,0,89152,193604,0,2440859,881768,195892,0,0,450430,1075372,0,0,10,0,925000,33563,1990429,MAKES GRANTS TO NON-PROFITS THAT DIRECTLY IMPROVE THE HEALTH AND WELL-BEING OF CHILDREN.,459751,1000,0,0,0,1925215,1384751,171810,1473903,0,0,,"RMHC OF THE PHILADELPHIA REGION, INC. GRANTS HUNDREDS OF THOUSANDS OF DOLLARS PER YEAR TO SUPPORT NON-PROFIT PROGRAMS THAT DIRECTLY IMPROVE THE HEALTH AND WELL-BEING OF CHILDREN. LOCALLY, RMHC SUPPORTS THE PHILADELPHIA, SOUTHERN NEW JERSEY AND DE...",1043744,925000,,,,,,,,,,,,,,,1043744,"THE CORPORATION IS ORGANIZED AND WILL BE OPERATED EXCLUSIVELY FOR CHARITABLE, EDUCATIONAL AND SCIENTIFIC PURPOSES WITHIN THE MEANING OF SECTION 501(C)(3) OF THE INTERNAL REVENUE CODE. 
SUCH PURPOSES SHALL BE LIMITED TO PROVIDING SUPPORT AND FUNDIN...",1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,1,1,1,10,10,0,0,0,0,0,"[""PA"", ""NJ"", ""DE""]",0,0,0,0,1,0,0,0,0,0,0,0,1439340,,,,,,,,,,,,,,,1439340,1000,,1473903,,,,,86228,,86228,,33000,892000,,,,,,123,763,,0,,,,,,,,,,,21675,,215,,,,,,,,,,,,118744,,,,,,,,,1384751,195892,145115,1043744,147981,,,,170617,,,,,44353,166000,,,,,1990429,,,,,1851561,,,256845,86228,,1,0,240077,,332660,270700,,,,,,2440859,0,,,,89152,,1,0,1,0,,1,0,0,1,1,,1,1525 VALLEY CENTER PARKWAY NO 300,,BETHLEHEM,18017,,PA,2011


In [64]:
%%time
df[Int64_vars].dtypes[:25]

CPU times: total: 2.41 s
Wall time: 2.63 s


F9_00_HD_EXEMPT_STATUS_501C           Int64
F9_00_HD_GROSS_RCPT                   Int64
F9_00_HD_TAX_YEAR                     Int64
F9_00_HD_YEAR_FORMED                  Int64
F9_01_PC_BEN_PAID_MEMB_PRIOR          Int64
F9_01_PC_CONTR_GRANTS_CURR            Int64
F9_01_PC_CONTR_GRANTS_PRIOR           Int64
F9_01_PC_GRANTS_PRIOR                 Int64
F9_01_PC_INDEP_VOTING_MEMB            Int64
F9_01_PC_INVEST_INCOME_PRIOR          Int64
F9_01_PC_NET_ASSETS_BOY               Int64
F9_01_PC_OTHER_EXPENSE_PRIOR          Int64
F9_01_PC_OTHER_REV_PRIOR              Int64
F9_01_PC_PROF_FUNDRISING_EXP_CURR     Int64
F9_01_PC_PROF_FUNDRISING_EXP_PRIOR    Int64
F9_01_PC_PROG_SERVICE_REV_PRIOR       Int64
F9_01_PC_REV_LESS_EXP_CURR            Int64
F9_01_PC_REV_LESS_EXP_PRIOR           Int64
F9_01_PC_TOT_ASSETS_EOY               Int64
F9_01_PC_TOT_EXP_PRIOR                Int64
F9_01_PC_TOT_FNDR_EXP_CURR            Int64
F9_01_PC_TOT_INDIV_EMPLOYED           Int64
F9_01_PC_TOT_INDIV_VOLUNTEERS   

#### Check `dtypes`
Count the variables in `Int64_vars` that are not yet of dtype 'Int64':

In [67]:
%%time
# Count how many are not of dtype 'Int64'
not_int64 = df[Int64_vars].dtypes != 'Int64'
not_int64.sum()

CPU times: total: 2.45 s
Wall time: 2.59 s


0

To list those columns specifically:

In [68]:
%%time
df[Int64_vars].dtypes[not_int64]

CPU times: total: 2.41 s
Wall time: 2.54 s


Series([], dtype: object)

Alternatively, in one line:

In [69]:
%%time
df[Int64_vars].dtypes[df[Int64_vars].dtypes != 'Int64']

CPU times: total: 4.89 s
Wall time: 5.17 s


Series([], dtype: object)
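
The checks above select `df[Int64_vars]` first, which builds a sub-frame before reading its dtypes. A possible alternative, sketched here on a toy frame, is to index the `dtypes` Series directly and compare dtype names as strings. The frame `toy` and the list `cols` are hypothetical stand-ins for `df` and `Int64_vars`.

```python
import pandas as pd

# Hypothetical stand-in for df: one nullable Int64 column, one float64, one NumPy int64.
toy = pd.DataFrame({
    "a": pd.array([1, None], dtype="Int64"),
    "b": [1.0, 2.5],
    "c": [1, 2],
})
cols = ["a", "b", "c"]  # stand-in for Int64_vars

# Index the dtypes Series by column name instead of selecting the columns first.
dtype_names = toy.dtypes.loc[cols].astype(str)

# The comparison is case-sensitive: NumPy "int64" is not the nullable "Int64".
not_int64 = dtype_names[dtype_names != "Int64"]
print(len(not_int64))
print(not_int64)
```

Comparing names as strings sidesteps any ambiguity in how a NumPy dtype compares against the label 'Int64', and it makes the difference between `int64` and `Int64` explicit.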

In [70]:
df[Int64_vars].sample(10)

Unnamed: 0,F9_00_HD_EXEMPT_STATUS_501C,F9_00_HD_GROSS_RCPT,F9_00_HD_TAX_YEAR,F9_00_HD_YEAR_FORMED,F9_01_PC_BEN_PAID_MEMB_PRIOR,F9_01_PC_CONTR_GRANTS_CURR,F9_01_PC_CONTR_GRANTS_PRIOR,F9_01_PC_GRANTS_PRIOR,F9_01_PC_INDEP_VOTING_MEMB,F9_01_PC_INVEST_INCOME_PRIOR,F9_01_PC_NET_ASSETS_BOY,F9_01_PC_OTHER_EXPENSE_PRIOR,F9_01_PC_OTHER_REV_PRIOR,F9_01_PC_PROF_FUNDRISING_EXP_CURR,F9_01_PC_PROF_FUNDRISING_EXP_PRIOR,F9_01_PC_PROG_SERVICE_REV_PRIOR,F9_01_PC_REV_LESS_EXP_CURR,F9_01_PC_REV_LESS_EXP_PRIOR,F9_01_PC_TOT_ASSETS_EOY,F9_01_PC_TOT_EXP_PRIOR,F9_01_PC_TOT_FNDR_EXP_CURR,F9_01_PC_TOT_INDIV_EMPLOYED,F9_01_PC_TOT_INDIV_VOLUNTEERS,F9_01_PC_TOT_LIABILITIES_EOY,F9_01_PC_TOT_REVENUE_PRIOR,F9_01_PC_TOT_UBI_GROSS,F9_01_PC_TOT_UBI_NET,F9_01_PC_VOTING_MEMB_GOV_BODY,F9_01_PZ_BEN_PAID_TO_MEMB_CURR,F9_01_PZ_GRANTS_PAID_CURR,F9_01_PZ_INVEST_INCOME_CURR,F9_01_PZ_NAFB_EOY,F9_01_PZ_OTHER_EXPENSE_CURR,F9_01_PZ_OTHER_REV_CURR,F9_01_PZ_PROG_SERVICE_REV_CURR,F9_01_PZ_SALARIES_CURR,F9_01_PZ_SALARIES_PRIOR,F9_01_PZ_TOT_ASSETS_BOY,F9_01_PZ_TOT_EXP_CURR,F9_01_PZ_TOT_LIAB_BOY,F9_01_PZ_TOT_REV_CURR,F9_03_PC_PROG_SVC_ACC_1_CODE,F9_03_PC_PROG_SVC_ACC_1_EXP,F9_03_PC_PROG_SVC_ACC_1_GRNT,F9_03_PC_PROG_SVC_ACC_1_REV,F9_03_PC_PROG_SVC_ACC_2_CODE,F9_03_PC_PROG_SVC_ACC_2_EXP,F9_03_PC_PROG_SVC_ACC_2_GRNT,F9_03_PC_PROG_SVC_ACC_2_REV,F9_03_PC_PROG_SVC_ACC_3_CODE,F9_03_PC_PROG_SVC_ACC_3_EXP,F9_03_PC_PROG_SVC_ACC_3_GRNT,F9_03_PC_PROG_SVC_ACC_3_REV,F9_03_PC_TOT_OTH_PROG_SVC_EXP,F9_03_PC_TOT_OTH_PROG_SVC_GRNT,F9_03_PC_TOT_OTH_PROG_SVC_REV,F9_03_PC_TOT_PROG_SVC_EXPENSE,F9_05_PC_NUMBER_EMPLOYEES_W3,F9_05_PC_NUMBER_FORMS_1096,F9_06_PC_NUM_IND_VOTING_MEMBERS,F9_06_PC_NUM_VOTING_GOV_MEMBERS,F9_07_PC_NUM_CONTRCTRS_GRTR_100K,F9_07_PC_NUM_INDS_GREATER_100K,F9_07_PC_TOT_OTHER_COMPENSATION,F9_07_PC_TOT_REPRT_COMP_FROM_ORG,F9_07_PC_TOT_REPRT_COMP_RLTD_ORG,F9_08_PC_ALL_OTHER_CONTRIBUTIONS,F9_08_PC_CONTS_REPRTD_FNDRAISNG,F9_08_PC_COST_OF_GOODS_SOLD,F9_08_PC_FEDERATED_CAMPAIGNS,F9_08_PC_FUNDRAISING_DIRECT_EXP,F9_08_PC_FUNDRAISING_E
VENTS,F9_08_PC_FUNDRAISING_GROSS_INC,F9_08_PC_GAMING_DIRECT_EXPENSES,F9_08_PC_GAMING_GROSS_INCOME,F9_08_PC_GOVERNMENT_GRANTS,F9_08_PC_GROSS_SALES_INVENTORY,F9_08_PC_MEMBERSHIP_DUES,F9_08_PC_NONCASH_CONTRIBUTIONS,F9_08_PC_PROGRAM_SVCE_REV_TOTAL,F9_08_PC_RELATED_ORGANIZATIONS,F9_08_PC_TOTAL_CONTRIBUTIONS,F9_08_PC_TOTAL_OTHER_REVENUE,F9_08_PC_TOTAL_PROG_SVCE_REVENUE,F9_08_PC_TOTAL_REVENUE,F9_09_EXP_AD_PROMO_TOT,F9_09_EXP_BENF_PAID_MEMB_TOT,F9_09_EXP_CONF_MEETING_TOT,F9_09_EXP_DEPREC_FUNDR,F9_09_EXP_DEPREC_MAG,F9_09_EXP_DEPREC_PROG,F9_09_EXP_DEPREC_TOT,F9_09_EXP_GRANT_FRGN_TOT,F9_09_EXP_GRANT_INDIV_DMSTC_TOT,F9_09_EXP_GRANT_ORG_DMSTC_TOT,F9_09_EXP_INFO_TECH_TOT,F9_09_EXP_INSURANCE_TOT,F9_09_EXP_INTEREST_TOT,F9_09_EXP_JOINT_COSTS_TOT,F9_09_EXP_OCCUPANCY_TOT,F9_09_EXP_OFFICE_TOT,F9_09_EXP_ROY_TOT,F9_09_EXP_TRAVEL_ENTRTNMNT_TOT,F9_09_EXP_TRAVEL_TOT,F9_09_PC_COMP_DISQUAL_FUNDRAISE,F9_09_PC_COMP_DISQUAL_MGMT,F9_09_PC_COMP_DISQUAL_PROG_SVCE,F9_09_PC_COMP_DISQUAL_TOTAL,F9_09_PC_COMP_OFFICERS_FUNDRAISE,F9_09_PC_COMP_OFFICERS_MGMT,F9_09_PC_COMP_OFFICERS_PROG_SVCE,F9_09_PC_COMP_OFFICERS_TOTAL,F9_09_PC_FEES_FOR_SVCE_ACCT_TOT,F9_09_PC_FEES_FOR_SVCE_FR_TOT,F9_09_PC_FEES_FOR_SVCE_INVST_TOT,F9_09_PC_FEES_FOR_SVCE_LEGL_TOT,F9_09_PC_FEES_FOR_SVCE_LOBB_TOT,F9_09_PC_FEES_FOR_SVCE_MGMT_TOT,F9_09_PC_FEES_FOR_SVCE_OTH_TOT,F9_09_PC_OTHER_EMP_BEN_FUNDRAISE,F9_09_PC_OTHER_EMP_BEN_MGMT,F9_09_PC_OTHER_EMP_BEN_PROG_SVCE,F9_09_PC_OTHER_EMP_BEN_TOTAL,F9_09_PC_OTHER_SALARY_FUNDRAISE,F9_09_PC_OTHER_SALARY_MGMT,F9_09_PC_OTHER_SALARY_PROG_SVCE,F9_09_PC_OTHER_SALARY_TOTAL,F9_09_PC_PAYMENT_TO_AFFILIATES,F9_09_PC_PAYROLL_TAX_FUNDRAISE,F9_09_PC_PAYROLL_TAX_MGMT,F9_09_PC_PAYROLL_TAX_PROG_SVCE,F9_09_PC_PAYROLL_TAX_TOTAL,F9_09_PC_PENSION_CONT_FUNDRAISE,F9_09_PC_PENSION_CONT_MGMT,F9_09_PC_PENSION_CONT_PROG_SVCE,F9_09_PC_PENSION_CONT_TOTAL,F9_09_PC_TOTAL_FUNC_EXPENSES,F9_09_PC_TOTAL_FUNDRAISE_EXPENSE,F9_09_PC_TOTAL_MGMT_EXPENSE,F9_09_PC_TOTAL_PROG_SVCE_EXPENSE,F9_10_ASSETS_ACC_NET_EOY,F9_10_ASSETS_EXP_PREPAID_EO
Y,F9_10_ASSETS_INTANGIB_EOY,F9_10_ASSETS_INVENT_SALE_EOY,F9_10_ASSETS_LESS_DEPREC_EOY,F9_10_ASSETS_LOANS_DISQUAL_EOY,F9_10_ASSETS_NOTES_LOANS_NET_EOY,F9_10_ASSETS_OTH_EOY,F9_10_ASSETS_PLEDGES_NET_EOY,F9_10_LIAB_ACC_PAYABLE_EOY,F9_10_LIAB_GRANTS_PAYABLE_EOY,F9_10_LIAB_LOANS_OFF_EOY,F9_10_LIAB_REV_DEFERRED_EOY,F9_10_NAFB_RESTRICT_PERM_EOY,F9_10_NAFB_RESTRICT_TEMP_EOY,F9_10_NAFB_UNRESTRICT_EOY,F9_10_PC_BOND_LIABILITY_EOY,F9_10_PC_CASH_NON_INTEREST_BOY,F9_10_PC_CASH_NON_INTEREST_EOY,F9_10_PC_ESCROW_LIABILITY_EOY,F9_10_PC_INVEST_OTHER_SEC_EOY,F9_10_PC_INVEST_PROG_RELTD_EOY,F9_10_PC_INVEST_PUB_TRADED_EOY,F9_10_PC_LAND_BLDG_EQPMT,F9_10_PC_LAND_BLDG_EQPMT_DEPRCTN,F9_10_PC_LOANS_FROM_OFFICERS_EOY,F9_10_PC_OTHER_LIABILITIES_EOY,F9_10_PC_RET_EARNINGS_ENDWMT_EOY,F9_10_PC_SAVINGS_TEMP_INVEST_BOY,F9_10_PC_SAVINGS_TEMP_INVEST_EOY,F9_10_PC_SECURED_MORTGAGES_EOY,F9_10_PC_SECURE_MORT_NOTES_EOY,F9_10_PC_UNSECURED_LOANS_EOY,F9_10_PC_UNSECURED_NOTES_BOY,F9_10_PC_UNSECURED_NOTES_EOY,F9_10_PZ_TOTAL_ASSETS_EOY,F9_11_PC_RECNCLTN_DONATED_SVCES,F9_11_PC_RECNCLTN_INVSTMNT_EXP,F9_11_PC_RECNCLTN_PRIOR_PER_ADJ,F9_11_PC_RECNCLTN_REV_LESS_EXP,F9_11_PC_RECNCLTN_UNRLZD_GAIN
2773655,,2575470,2021,1994,0.0,2537409,2205843.0,1253014.0,6,13024.0,1223121,136151,0.0,0,0.0,0.0,545617,300110,1824175,1918757,103105,14,40.0,55437,2218867,0,0.0,7,0,1149094,-515,1768738,168497,0,1280,674966,529592.0,1264329,1992557,41208,2538174,,1003928,1003928.0,,,581709.0,,1280.0,,120481.0,120481.0,,24685.0,24685.0,,1730803,14,1,6,7,0.0,0.0,67771.0,158136.0,0.0,2537409.0,,,,,,,,,,,,46710.0,1280.0,,2537409.0,,1280.0,2538174,46798.0,,7632.0,,,,,1149094.0,,,31399.0,9630.0,,,5314.0,37705.0,,,12855.0,,,,,45752.0,14927.0,190109.0,250788.0,,,,16364.0,,,800.0,3118.0,4558.0,21304.0,28980.0,21366.0,79704.0,254000.0,355070.0,,2708.0,7334.0,30086.0,40128.0,,,,,1992557,103105,158649,1730803,,3260.0,,,0.0,,,,,55437.0,,,,,,,,484828,1035961,,,,34544.0,25212.0,25212.0,,,,750129.0,750410.0,,,,,,1824175,,,,545617,
994163,,193408,2015,1994,,5882,1848.0,,18,,390939,63921,11474.0,0,,151537.0,3513,-1487,637763,166346,0,4,,243311,164859,0,,18,0,0,0,394452,70520,5050,182476,119375,102425.0,638155,189895,247216,193408,,9239,,37168.0,,111969.0,,145308.0,,,,,,,,121208,4,1,18,18,,,,36750.0,,5882.0,,,,,,,,,,,,,182476.0,,5882.0,5050.0,182476.0,193408,254.0,,688.0,,5718.0,1429.0,7147.0,,,,,11087.0,9498.0,,12805.0,6135.0,,,251.0,,,,,,,,,9582.0,,,382.0,,,1174.0,,456.0,,456.0,,40166.0,70580.0,110746.0,250.0,,3076.0,5097.0,8173.0,,,,,189895,0,68687,121208,,,,,214334.0,,276848.0,61055.0,,2849.0,,,,,,394452.0,,94266,85526,13757.0,,,,282843.0,68509.0,,,,,,226705.0,226705.0,,,,637763,,,,3513,
2912144,,2023045,2021,1982,,1838114,1269484.0,,10,748.0,192200,268533,,0,,70292.0,102651,-37123,805663,1377647,8911,23,,510812,1340524,0,,10,0,0,578,294851,497002,0,184353,1423392,1109114.0,658367,1920394,466167,2023045,,1742553,,,,,,,,,,,,,,1742553,23,3,10,10,,1.0,,116666.0,,13533.0,,,,,,,,,1824581.0,,,,184353.0,,1838114.0,,184353.0,2023045,8911.0,,1225.0,,307.0,544.0,851.0,,,,,12616.0,,,113828.0,1703.0,,,38904.0,,,,,,2917.0,113749.0,116666.0,40575.0,,,,,,,,5376.0,209655.0,215031.0,,25136.0,980319.0,1005455.0,,,2156.0,84084.0,86240.0,,,,,1920394,8911,168930,1742553,107497.0,12946.0,,,9406.0,,,80157.0,,46050.0,,,324030.0,,,,,411107,585458,,,,,201965.0,192559.0,,140732.0,,90169.0,10199.0,,,,,,805663,,,,102651,
1468488,,342591,2017,2004,,317368,35395.0,,4,,71262,323773,,0,,317045.0,23609,28667,182374,323773,0,0,,87503,352440,0,,4,0,0,0,94871,318982,0,25223,0,,138262,318982,67000,342591,,62773,,,,245135.0,,,,,,,,,,307908,0,0,4,4,,,,,,317368.0,,,,,,,,,,,,,25223.0,,317368.0,,25223.0,342591,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,318982,0,0,318982,4325.0,4092.0,,,,,,,,87503.0,,,,,,94871.0,,18309,173957,,,,,,,,,,,,,,,,,182374,,,,23609,
113425,,225481,2011,1924,0.0,78863,74852.0,0.0,11,10351.0,108319,119119,42560.0,0,0.0,0.0,-41466,8644,66853,119119,3249,0,35.0,0,127763,0,0.0,11,0,0,10208,66853,177036,46499,0,0,0.0,108319,177036,0,135570,,165970,,,,,,,,,,,,,,165970,0,1,11,11,0.0,0.0,0.0,0.0,0.0,552.0,17357.0,,,27511.0,17357.0,47152.0,62400.0,76102.0,60206.0,,748.0,,,,78863.0,13156.0,,135570,487.0,,,,,,,,,,,2533.0,,,20151.0,6264.0,,,,,,,,,,,,3500.0,,,,,,,,,,,,,,,,,,,,,,,,177036,3249,7817,165970,,,,,,,,,,,,,,,,66853.0,,56485,49745,,,,,,,,,,51834.0,17108.0,,,,,,66853,,,,-41466,
559197,,212163,2013,2007,,0,,,11,67.0,4916947,120213,,0,,42483.0,-178045,-82040,6059687,124590,0,0,,1040312,42550,0,,13,0,0,72,5019375,335461,0,212091,54747,4377.0,6230384,390208,1313437,212163,,366204,,212163.0,,,,,,,,,,,,366204,0,0,11,13,,,600405.0,,2131252.0,,,,,,,,,,,,,,212091.0,,,,212091.0,212163,114.0,,,,,190717.0,190717.0,,,,,412.0,,,84767.0,11486.0,,,,,,,,,,,,8015.0,,,,,15989.0,,,,12033.0,12033.0,,,39646.0,39646.0,,,,3068.0,3068.0,,,,,390208,0,24004,366204,98452.0,,,,5857498.0,,,101503.0,,107597.0,,,956.0,,,5019375.0,,94988,2234,,,,,6131687.0,274189.0,,11410.0,,,,920349.0,920349.0,,,,6059687,,,,-178045,
115812,5.0,422673,2011,1949,,416034,402073.0,,9,489.0,136146,140162,519.0,0,,,-9867,5341,132077,397740,0,11,0.0,5502,403081,0,,9,0,0,431,126575,179744,6208,0,252796,257578.0,143385,432540,7239,422673,,432540,,422673.0,,,,,,,,,,,,432540,11,0,9,9,,,43009.0,102305.0,,,,,,,,,,,,,416034.0,,,,416034.0,6208.0,,422673,,,7985.0,0.0,0.0,261.0,261.0,,,,,4974.0,,,14345.0,10547.0,,,24899.0,,,,,0.0,0.0,146763.0,146763.0,4043.0,,,23644.0,,,,0.0,0.0,14450.0,14450.0,0.0,0.0,60367.0,60367.0,81014.0,0.0,0.0,13757.0,13757.0,0.0,0.0,17459.0,17459.0,432540,0,0,432540,,,,,966.0,,,1796.0,,5502.0,,,,,,126575.0,,106839,91901,,,,22002.0,29171.0,28205.0,,,,11817.0,15412.0,,,,,,132077,,,,-9867,
2686319,,1771431,2021,1977,,0,,880387.0,6,1078496.0,11756287,5539,,0,,,1284462,192570,13539750,885926,0,0,0.0,489000,1078496,0,0.0,8,0,485449,1771431,13050750,1520,0,0,0,,12245287,486969,489000,1771431,,398453,398453.0,0.0,,25500.0,25500.0,0.0,,25000.0,25000.0,0.0,36496.0,36496.0,0.0,485449,0,0,6,8,,,,,,,,,,,,,,,,,,,,,,,,1771431,,,,,,,,,10000.0,475449.0,,,,,,45.0,,,,,,,,,,,,1400.0,,,,,,,,,,,,,,,,,,,,,,,,486969,0,1520,485449,,,,,,,,,,,489000.0,,,,,,,195898,339478,,,,13200272.0,,,,,,,,,,,,,13539750,,,,1284462,
1683823,,136283488,2017,1986,0.0,319740,142766.0,0.0,10,770575.0,36889470,60776711,395985.0,0,0.0,125475762.0,11948768,9545548,109997622,117239540,0,1052,75.0,55860157,126785088,0,0.0,13,0,0,1846838,54137465,63966108,564924,130141505,56958131,56462829.0,95242678,120924239,58353208,132873007,,95034936,0.0,130141505.0,,,,,,,,,,,,95034936,1052,116,10,13,7.0,16.0,247771.0,1327392.0,1774558.0,79740.0,,,,,,,,,,,,,130141505.0,240000.0,319740.0,751400.0,130141505.0,132873007,194883.0,,237170.0,,2797634.0,1449475.0,4247109.0,,,,122105.0,328369.0,1247168.0,,1625777.0,20712214.0,,,64354.0,,,,,,613528.0,,613528.0,53665.0,,46023.0,50638.0,3481.0,,23503310.0,,1445927.0,5126468.0,6572395.0,,5220103.0,40603495.0,45823598.0,,,307261.0,2818770.0,3126031.0,,180967.0,641612.0,822579.0,120924239,0,25889303,95034936,8432691.0,170968.0,,3297971.0,17922265.0,,,3797026.0,,13783845.0,,,,,34006.0,54103459.0,34834936.0,2523,2523,,,,35651903.0,98178395.0,80256130.0,,7241376.0,,31106493.0,40722275.0,,,,,,109997622,,,,11948768,1412160.0
2874889,,7145858,2022,1886,,2639619,3225429.0,,15,189674.0,13216347,1802764,88281.0,0,,3509980.0,1320723,1952903,15151348,5060461,178148,317,109.0,1212087,7013364,0,0.0,15,0,0,123450,13939261,2043245,103160,4197500,3699761,3257697.0,15014554,5743006,1798207,7063729,,3056957,,2770148.0,,1721175.0,,1427352.0,,,,,,,,4778132,317,22,15,15,0.0,2.0,54128.0,356310.0,0.0,583309.0,23790.0,,44132.0,1066.0,23790.0,0.0,,,1988388.0,,0.0,,4197500.0,0.0,2639619.0,0.0,4197500.0,7063729,32533.0,,26500.0,7500.0,29592.0,389230.0,426322.0,0.0,,,125004.0,145443.0,10309.0,,397371.0,14986.0,,,3925.0,,,,,19970.0,74175.0,191143.0,285288.0,53493.0,,,,,,193608.0,4951.0,20650.0,113302.0,138903.0,104365.0,305213.0,2464255.0,2873833.0,97527.0,10428.0,30601.0,238168.0,279197.0,9015.0,30086.0,83439.0,122540.0,5743006,178148,786726,4778132,204845.0,73967.0,,,5101448.0,0.0,,0.0,,296971.0,,0.0,528349.0,,,,0.0,3466762,3973305,58991.0,,,3993389.0,13457654.0,8356206.0,0.0,135937.0,,1503763.0,1804394.0,191839.0,191839.0,0.0,0.0,0.0,15151348,,,,1320723,-597809.0


In [73]:
gc.collect()

710

In [74]:
%%time
df.describe().T

CPU times: total: 43 s
Wall time: 44.6 s


Unnamed: 0,count,mean,std,min,25%,50%,75%,max
F9_09_PC_FEES_FOR_SVCE_FR_TOT,759063.0,19424.472779,304953.261139,-35000.0,0.0,0.0,0.0,97665783.0
F9_00_HD_ADDR_CHANGE,3469008.0,0.039835,0.195571,0.0,0.0,0.0,0.0,1.0
F9_00_HD_AMENDED_RETURN,3469008.0,0.01192,0.108527,0.0,0.0,0.0,0.0,1.0
F9_00_HD_EXEMPT_STATUS_4847A1,3469008.0,0.000709,0.026626,0.0,0.0,0.0,0.0,1.0
F9_00_HD_EXEMPT_STATUS_501C,857105.0,7.263418,4.007911,2.0,5.0,6.0,8.0,29.0
...,...,...,...,...,...,...,...,...
F9_12_PC_FED_GRNT_AUDIT_PERFORMD,3469008.0,0.085901,0.280217,0.0,0.0,0.0,0.0,1.0
F9_12_PC_FED_GRNT_AUDIT_REQUIRED,3469008.0,0.086664,0.281342,0.0,0.0,0.0,0.0,1.0
F9_12_PC_FINCL_STMTS_AUDITED,3469008.0,0.434902,0.495744,0.0,0.0,0.0,1.0,1.0
F9_12_SCHED_O_X,3469008.0,0.165592,0.371714,0.0,0.0,0.0,0.0,1.0


#### Save DF

In [75]:
len(df)

3469008

In [76]:
def prepare_for_save(df):
    """Return an independent copy of df and run garbage collection before saving."""
    import gc

    # Deep copy so the returned frame shares no data with earlier slices or views.
    # Note: this needs roughly twice the memory while both copies exist.
    df = df.copy()

    # Optionally sort or reset the index if needed
    # df = df.sort_values("some_column")  # Only if relevant
    # df = df.reset_index(drop=True)

    # Trigger garbage collection
    gc.collect()

    print("🧼 DataFrame copied + garbage collected. Ready to save.")
    return df

In [None]:
# Optional: copy df and run garbage collection before saving (e.g., after many .head() or .sort_values() calls)
#df = prepare_for_save(df)

In [77]:
%%time
import datetime
print ("Current date and time : ", datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"), '\n')
df.to_feather('D:/all_filings_april_2025_all_controls_combined_parsed_type.feather')

Current date and time :  2025-04-18 21:47:54 

CPU times: total: 1min 7s
Wall time: 45.5 s


In [78]:
%%time
import datetime
print ("Current date and time : ", datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"), '\n')
df.to_parquet("D:/all_filings_april_2025_all_controls_combined_parsed_type.parquet", engine="pyarrow", compression="snappy", index=False)

Current date and time :  2025-04-18 21:49:41 

CPU times: total: 1min 54s
Wall time: 1min 56s


In [79]:
%%time
import datetime
print ("Current date and time : ", datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"), '\n')
df.to_pickle('D:/all_filings_april_2025_all_controls_combined_parsed_type.pkl.gz', compression='gzip')

Current date and time :  2025-04-18 21:53:19 

CPU times: total: 1h 31min 9s
Wall time: 1h 34min 23s
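
After saving, it can be worth confirming that the nullable 'Int64' dtypes survive a write and read cycle. The sketch below is one way to do that on a small toy frame; the helper `dtypes_survive_roundtrip`, the frame `toy`, and the temporary file name are all hypothetical, and gzip-compressed pickle is used only because it needs no extra engine. The same helper could be pointed at `to_parquet`/`pd.read_parquet` or `to_feather`/`pd.read_feather`, assuming pyarrow is installed.

```python
import os
import tempfile

import pandas as pd


def dtypes_survive_roundtrip(frame, write, read):
    """Write frame to a temporary file, read it back, and compare dtype names."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "roundtrip.bin")
        write(frame, path)
        restored = read(path)
    # Compare dtype names column by column (index and values must both match).
    return frame.dtypes.astype(str).equals(restored.dtypes.astype(str))


# Hypothetical toy frame with a missing value in a nullable integer column.
toy = pd.DataFrame({"amount": pd.array([10, None, -5], dtype="Int64")})

ok = dtypes_survive_roundtrip(
    toy,
    write=lambda f, p: f.to_pickle(p, compression="gzip"),
    read=lambda p: pd.read_pickle(p, compression="gzip"),
)
print(ok)
```

Running a check like this on a small sample before the full save keeps the cost low, which matters given the save times recorded above.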
