# 1. Introduction

This document performs an initial exploration of the fields and values in Form 990 to identify variables relevant to a detailed analysis of philanthropic giving in the environmental and social justice sectors. We also intend to assess the transparency and accountability of the filing organizations, using the data reported on Form 990 to inform that inquiry.

Form 990 is a tax form that the United States Internal Revenue Service (IRS) requires tax-exempt organizations to file annually. The form gives the IRS and the public financial information about the nonprofit organization, and it is used to assess the organization's compliance with tax laws. The form comes in several variants. Form 990 is the standard form for organizations with gross receipts of $200,000 or more, or total assets of $500,000 or more. Form 990-EZ is a shorter version for organizations with gross receipts of less than $200,000 and total assets of less than $500,000. Further information can be found on IRS.gov: https://www.irs.gov/statistics/soi-tax-stats-annual-extract-of-tax-exempt-organization-financial-data
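
As a rough illustration of the filing thresholds described above, the sketch below maps an organization's gross receipts and total assets to the form it would be expected to file. The helper name `expected_form` and the two constants are hypothetical and not part of any IRS or pandas API. Amounts exactly at a threshold are treated here as Form 990, which is an assumption of this sketch, and other form variants are not handled.

```python
# Thresholds as quoted in this section, in US dollars.
GROSS_RECEIPTS_LIMIT = 200_000
TOTAL_ASSETS_LIMIT = 500_000


def expected_form(gross_receipts: float, total_assets: float) -> str:
    """Return the form implied by the two thresholds (hypothetical helper)."""
    # The short form applies only when BOTH amounts are below their limits.
    if gross_receipts < GROSS_RECEIPTS_LIMIT and total_assets < TOTAL_ASSETS_LIMIT:
        return "990-EZ"
    # Assumption: anything at or above either limit files the standard form.
    return "990"


print(expected_form(150_000, 300_000))  # both below the limits
print(expected_form(150_000, 750_000))  # total assets above the limit
```

Note that the two conditions combine differently: the short form requires both amounts to be under their limits, while exceeding either one is enough to require the standard form.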


# 2. First Glance

## 2.1. General Summary

In [1]:
# Libraries for data manipulation.
import pandas as pd
import numpy as np

# Libraries for data visualisation.
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import plotly.io as pio

# Libraries for Quarto rendering.
from IPython.display import Markdown, display
from tabulate import tabulate

# Suppress UserWarning messages.
import warnings
warnings.filterwarnings("ignore", category=UserWarning)

# Read in data.
form_990_2022 = pd.read_csv('../../data/22eoextract990.csv')

# Print data dimensions.
shape_caption = "Data Dimensions:"
shape_df = pd.DataFrame({
    'Dimension': ['Rows', 'Columns'],
    'Count': [form_990_2022.shape[0], form_990_2022.shape[1]]
})
shape_df['Count'] = shape_df['Count'].apply(lambda x: f"{x:,}")
shape_markdown = shape_caption + "\n\n" + shape_df.to_markdown(index=False)
display(Markdown(shape_markdown))

# Print a sample of the data.
first_five_rows_caption = "First Five Rows of Data:"
first_five_rows_markdown = first_five_rows_caption + "\n\n" + form_990_2022.head().to_markdown(index=False)
display(Markdown(first_five_rows_markdown))

# Print metadata.
metadata_caption = "Metadata:"
column_metadata = []

for col in form_990_2022.columns:
    # Gather metadata for each col.
    col_metadata = {
        'Column Name': col,
        'Data Type': str(form_990_2022[col].dtype),
        'Unique Values': form_990_2022[col].nunique(),
        'Missing Values': form_990_2022[col].isnull().sum()
    }
    # Append metadata to list.
    column_metadata.append(col_metadata)

# Convert list to pd df and then markdown table.
metadata_df = pd.DataFrame(column_metadata)
metadata_df['Unique Values'] = metadata_df['Unique Values'].apply(lambda x: f"{x:,}")
metadata_df['Missing Values'] = metadata_df['Missing Values'].apply(lambda x: f"{x:,}")
metadata_markdown = metadata_caption + "\n\n" + metadata_df.to_markdown(index=False)
display(Markdown(metadata_markdown))

Data Dimensions:

| Dimension   | Count   |
|:------------|:--------|
| Rows        | 326,123 |
| Columns     | 246     |

First Five Rows of Data:

| efile   |         EIN |   tax_pd |   subseccd | s501c3or4947a1cd   | schdbind   | politicalactvtscd   |   lbbyingactvtscd | subjto6033cd   | dnradvisedfundscd   | prptyintrcvdcd   | maintwrkofartcd   | crcounselingqstncd   | hldassetsintermpermcd   | rptlndbldgeqptcd   | rptinvstothsecd   | rptinvstprgrelcd   | rptothasstcd   | rptothliabcd   | sepcnsldtfinstmtcd   | sepindaudfinstmtcd   | inclinfinstmtcd   | operateschools170cd   | frgnofficecd   | frgnrevexpnscd   | frgngrntscd   | frgnaggragrntscd   | rptprofndrsngfeescd   | rptincfnndrsngcd   | rptincgamingcd   | operatehosptlcd   |   hospaudfinstmtcd | rptgrntstogovtcd   | rptgrntstoindvcd   | rptyestocompnstncd   | txexmptbndcd   |   invstproceedscd |   maintescrwaccntcd |   actonbehalfcd |   engageexcessbnftcd |   awarexcessbnftcd | loantofficercd   | grantoofficercd   | dirbusnreltdcd   | fmlybusnreltdcd   | servasofficercd   | recvnoncashcd   | recvartcd   | ceaseoperationscd   | sellorexchcd   | ownsepentcd   | reltdorgcd   | intincntrlcd   |   orgtrnsfrcd | conduct5percentcd   | compltschocd   |   f1096cnt |   fw2gcnt | wthldngrulescd   |   noemplyeesw3cnt | filerqrdrtnscd   | unrelbusinccd   | filedf990tcd   | frgnacctcd   | prohibtdtxshltrcd   | prtynotifyorgcd   |   filedf8886tcd | solicitcntrbcd   |   exprstmntcd |   providegoodscd |   notfydnrvalcd |   filedf8282cd |   f8282cnt |   fndsrcvdcd |   premiumspaidcd |   filedf8899cd |   filedf1098ccd |   excbushldngscd |   s4966distribcd |   distribtodonorcd |   initiationfees |   grsrcptspublicuse |   grsincmembers |   grsincother |   filedlieuf1041cd |   txexmptint |   qualhlthplncd |   qualhlthreqmntn |   qualhlthonhnd | rcvdpdtngcd   |   filedf720cd |   totreprtabled |   totcomprelatede |   totestcompf |   noindiv100kcnt |   nocontractor100kcnt |   totcntrbgfts |   prgmservcode2acd |   totrev2acola |   prgmservcode2bcd |   totrev2bcola |   prgmservcode2ccd |   totrev2ccola |   prgmservcode2dcd |   totrev2dcola |   prgmservcode2ecd |   totrev2ecola 
|   totrev2fcola |   totprgmrevnue |   invstmntinc |   txexmptbndsproceeds |   royaltsinc |   grsrntsreal |   grsrntsprsnl |   rntlexpnsreal |   rntlexpnsprsnl |   rntlincreal |   rntlincprsnl |   netrntlinc |   grsalesecur |   grsalesothr |   cstbasisecur |   cstbasisothr |   gnlsecur |   gnlsothr |   netgnls |   grsincfndrsng |   lessdirfndrsng |   netincfndrsng |   grsincgaming |   lessdirgaming |   netincgaming |   grsalesinvent |   lesscstofgoods |   netincsales |   miscrev11acd |   miscrevtota |   miscrev11bcd |   miscrevtot11b |   miscrev11ccd |   miscrevtot11c |   miscrevtot11d |   miscrevtot11e |   totrevenue |   grntstogovt |   grnsttoindiv |   grntstofrgngovt |   benifitsmembrs |   compnsatncurrofcr |   compnsatnandothr |   othrsalwages |   pensionplancontrb |   othremplyeebenef |   payrolltx |   feesforsrvcmgmt |   legalfees |   accntingfees |   feesforsrvclobby |   profndraising |   feesforsrvcinvstmgmt |   feesforsrvcothr |   advrtpromo |   officexpns |   infotech |   royaltsexpns |   occupancy |   travel |   travelofpublicoffcl |   converconventmtng |   interestamt |   pymtoaffiliates |   deprcatndepletn |   insurance |   othrexpnsa |   othrexpnsb |   othrexpnsc |   othrexpnsd |   othrexpnse |   othrexpnsf |   totfuncexpns |   nonintcashend |   svngstempinvend |   pldgegrntrcvblend |   accntsrcvblend |   currfrmrcvblend |   rcvbldisqualend |   notesloansrcvblend |   invntriesalesend |   prepaidexpnsend |   lndbldgsequipend |   invstmntsend |   invstmntsothrend |   invstmntsprgmend |   intangibleassetsend |   othrassetsend |   totassetsend |   accntspayableend |   grntspayableend |   deferedrevnuend |   txexmptbndsend |   escrwaccntliabend |   paybletoffcrsend |   secrdmrtgsend |   unsecurednotesend |   othrliabend |   totliabend |   unrstrctnetasstsend |   temprstrctnetasstsend |   permrstrctnetasstsend |   capitalstktrstend |   paidinsurplusend |   retainedearnend |   totnetassetend |   totnetliabastend |   nonpfrea |   totnooforgscnt |   totsupport 
|   gftgrntsrcvd170 |   txrevnuelevied170 |   srvcsval170 |   pubsuppsubtot170 |   exceeds2pct170 |   pubsupplesspct170 |   samepubsuppsubtot170 |   grsinc170 |   netincunreltd170 |   othrinc170 |   totsupp170 |   grsrcptsrelated170 |   totgftgrntrcvd509 |   grsrcptsadmissn509 |   grsrcptsactivities509 |   txrevnuelevied509 |   srvcsval509 |   pubsuppsubtot509 |   rcvdfrmdisqualsub509 |   exceeds1pct509 |   subtotpub509 |   pubsupplesub509 |   samepubsuppsubtot509 |   grsinc509 |   unreltxincls511tx509 |   subtotsuppinc509 |   netincunrelatd509 |   othrinc509 |   totsupp509 |
|:--------|------------:|---------:|-----------:|:-------------------|:-----------|:--------------------|------------------:|:---------------|:--------------------|:-----------------|:------------------|:---------------------|:------------------------|:-------------------|:------------------|:-------------------|:---------------|:---------------|:---------------------|:---------------------|:------------------|:----------------------|:---------------|:-----------------|:--------------|:-------------------|:----------------------|:-------------------|:-----------------|:------------------|-------------------:|:-------------------|:-------------------|:---------------------|:---------------|------------------:|--------------------:|----------------:|---------------------:|-------------------:|:-----------------|:------------------|:-----------------|:------------------|:------------------|:----------------|:------------|:--------------------|:---------------|:--------------|:-------------|:---------------|--------------:|:--------------------|:---------------|-----------:|----------:|:-----------------|------------------:|:-----------------|:----------------|:---------------|:-------------|:--------------------|:------------------|----------------:|:-----------------|--------------:|-----------------:|----------------:|---------------:|-----------:|-------------:|-----------------:|---------------:|----------------:|-----------------:|-----------------:|-------------------:|-----------------:|--------------------:|----------------:|--------------:|-------------------:|-------------:|----------------:|------------------:|----------------:|:--------------|--------------:|----------------:|------------------:|--------------:|-----------------:|----------------------:|---------------:|-------------------:|---------------:|-------------------:|---------------:|-------------------:|---------------:|-------------------:|---------------:|-------------------:|---------------:|
---------------:|----------------:|--------------:|----------------------:|-------------:|--------------:|---------------:|----------------:|-----------------:|--------------:|---------------:|-------------:|--------------:|--------------:|---------------:|---------------:|-----------:|-----------:|----------:|----------------:|-----------------:|----------------:|---------------:|----------------:|---------------:|----------------:|-----------------:|--------------:|---------------:|--------------:|---------------:|----------------:|---------------:|----------------:|----------------:|----------------:|-------------:|--------------:|---------------:|------------------:|-----------------:|--------------------:|-------------------:|---------------:|--------------------:|-------------------:|------------:|------------------:|------------:|---------------:|-------------------:|----------------:|-----------------------:|------------------:|-------------:|-------------:|-----------:|---------------:|------------:|---------:|----------------------:|--------------------:|--------------:|------------------:|------------------:|------------:|-------------:|-------------:|-------------:|-------------:|-------------:|-------------:|---------------:|----------------:|------------------:|--------------------:|-----------------:|------------------:|------------------:|---------------------:|-------------------:|------------------:|-------------------:|---------------:|-------------------:|-------------------:|----------------------:|----------------:|---------------:|-------------------:|------------------:|------------------:|-----------------:|--------------------:|-------------------:|----------------:|--------------------:|--------------:|-------------:|----------------------:|------------------------:|------------------------:|--------------------:|-------------------:|------------------:|-----------------:|-------------------:|-----------:|-----------------:|-------------:|
------------------:|--------------------:|--------------:|-------------------:|-----------------:|--------------------:|-----------------------:|------------:|-------------------:|-------------:|-------------:|---------------------:|--------------------:|---------------------:|------------------------:|--------------------:|--------------:|-------------------:|-----------------------:|-----------------:|---------------:|------------------:|-----------------------:|------------:|-----------------------:|-------------------:|--------------------:|-------------:|-------------:|
| P       | 1.00189e+07 |   202004 |         19 | N                  | N          | N                   |               nan | N              | N                   | N                | N                 | N                    | N                       | N                  | N                 | N                  | N              | N              | N                    | Y                    | N                 | N                     | N              | N                | N             | N                  | N                     | N                  | N                | N                 |                nan | N                  | N                  | N                    | N              |               nan |                 nan |             nan |                  nan |                nan | N                | N                 | N                | N                 | N                 | N               | N           | N                   | N              | N             | N            | N              |           nan | N                   | Y              |          3 |         0 | N                |                 1 | Y                | N               | nan            | N            | N                   | N                 |             nan | N                |           nan |              nan |             nan |            nan |          0 |          nan |              nan |            nan |             nan |              nan |              nan |                nan |                0 |                   0 |               0 |             0 |                nan |            0 |             nan |                 0 |               0 | N             |           nan |           40000 |                 0 |             0 |                0 |                     0 |         125780 |                nan |          54362 |                nan |          48087 |                nan |          23211 |                nan |           8294 |                nan |              0 
|              0 |          133954 |           725 |                     0 |            0 |             0 |              0 |               0 |                0 |             0 |              0 |            0 |             0 |             0 |              0 |              0 |          0 |          0 |         0 |               0 |                0 |               0 |              0 |               0 |              0 |               0 |                0 |             0 |            nan |             0 |            nan |               0 |            nan |               0 |               0 |               0 |       260459 |             0 |           1100 |                 0 |                0 |               40000 |                  0 |           1800 |                   0 |                  0 |        2624 |                 0 |           0 |           2700 |                  0 |               0 |                      0 |                 0 |            0 |         7097 |          0 |              0 |       12560 |        0 |                     0 |               13719 |             0 |                 0 |                 0 |        4395 |        65485 |        60719 |        44495 |        13703 |         6032 |            0 |         276429 |          179298 |             36656 |                   0 |                0 |                 0 |                 0 |                    0 |                  0 |                 0 |                  0 |              0 |                  0 |                  0 |                     0 |               0 |         215954 |                  0 |                 0 |                 0 |                0 |                   0 |                  0 |               0 |                   0 |             0 |            0 |                215954 |                       0 |                       0 |                   0 |                  0 |                 0 |           215954 |             215954 |        nan |                0 |            0 
|                 0 |                   0 |             0 |                  0 |                0 |                   0 |                      0 |           0 |                  0 |            0 |            0 |                    0 |                   0 |                    0 |                       0 |                   0 |             0 |                  0 |                      0 |                0 |              0 |                 0 |                      0 |           0 |                      0 |                  0 |                   0 |            0 |            0 |
| E       | 1.00189e+07 |   202104 |         19 | N                  | N          | N                   |               nan | N              | N                   | N                | N                 | N                    | nan                     | N                  | N                 | N                  | N              | N              | N                    | Y                    | N                 | N                     | N              | N                | N             | N                  | N                     | N                  | N                | N                 |                nan | N                  | N                  | N                    | N              |               nan |                 nan |             nan |                  nan |                nan | N                | N                 | N                | N                 | nan               | N               | N           | N                   | N              | N             | N            | N              |           nan | N                   | Y              |          2 |         0 | N                |                 1 | Y                | N               | nan            | N            | N                   | N                 |             nan | N                |           nan |              nan |             nan |            nan |          0 |          nan |              nan |            nan |             nan |              nan |              nan |                nan |                0 |                   0 |               0 |             0 |                nan |            0 |             nan |                 0 |               0 | N             |           nan |           42000 |                 0 |             0 |                0 |                     0 |         122786 |                nan |          20958 |                nan |           9018 |                nan |           2432 |                nan |           1091 |                nan |              0 
|              0 |           33499 |           307 |                     0 |            0 |             0 |              0 |               0 |                0 |             0 |              0 |            0 |             0 |             0 |              0 |              0 |          0 |          0 |         0 |               0 |                0 |               0 |              0 |               0 |              0 |               0 |                0 |             0 |            nan |             0 |            nan |               0 |            nan |               0 |               0 |               0 |       156592 |             0 |              0 |                 0 |                0 |               42000 |                  0 |           1100 |                   0 |                  0 |        2878 |                 0 |           0 |           3000 |                  0 |               0 |                      0 |                 0 |            0 |         6200 |          0 |              0 |       12248 |        0 |                     0 |                 904 |             0 |                 0 |                 0 |         989 |        64897 |         6051 |         1657 |          260 |            0 |            0 |         142184 |          193408 |             36954 |                   0 |                0 |                 0 |                 0 |                    0 |                  0 |                 0 |                  0 |              0 |                  0 |                  0 |                     0 |               0 |         230362 |                  0 |                 0 |                 0 |                0 |                   0 |                  0 |               0 |                   0 |             0 |            0 |                230362 |                       0 |                       0 |                   0 |                  0 |                 0 |           230362 |             230362 |        nan |                0 |            0 
|                 0 |                   0 |             0 |                  0 |                0 |                   0 |                      0 |           0 |                  0 |            0 |            0 |                    0 |                   0 |                    0 |                       0 |                   0 |             0 |                  0 |                      0 |                0 |              0 |                 0 |                      0 |           0 |                      0 |                  0 |                   0 |            0 |            0 |
| E       | 1.00189e+07 |   202204 |         19 | N                  | N          | N                   |               nan | N              | N                   | N                | N                 | N                    | nan                     | N                  | N                 | N                  | N              | N              | N                    | Y                    | N                 | N                     | N              | N                | N             | N                  | N                     | N                  | N                | N                 |                nan | N                  | N                  | N                    | N              |               nan |                 nan |             nan |                  nan |                nan | N                | N                 | N                | N                 | nan               | N               | N           | N                   | N              | N             | N            | N              |           nan | N                   | Y              |          1 |         0 | N                |                 1 | Y                | N               | nan            | N            | N                   | N                 |             nan | N                |           nan |              nan |             nan |            nan |          0 |          nan |              nan |            nan |             nan |              nan |              nan |                nan |                0 |                   0 |               0 |             0 |                nan |            0 |             nan |                 0 |               0 | N             |           nan |           42750 |                 0 |             0 |                0 |                     0 |         122782 |                nan |          43047 |                nan |          11477 |                nan |          10368 |                nan |              1 |                nan |              0 
|              0 |           64893 |           194 |                     0 |            0 |             0 |              0 |               0 |                0 |             0 |              0 |            0 |             0 |             0 |              0 |              0 |          0 |          0 |         0 |               0 |                0 |               0 |              0 |               0 |              0 |               0 |                0 |             0 |            nan |             0 |            nan |               0 |            nan |               0 |               0 |               0 |       187869 |             0 |              0 |                 0 |                0 |               42750 |                  0 |           1350 |                   0 |                  0 |        2846 |                 0 |           0 |           3700 |                  0 |               0 |                      0 |                 0 |            0 |         6333 |          0 |              0 |       12552 |        0 |                     0 |                8447 |             0 |                 0 |                 0 |         989 |        63566 |        21942 |        11304 |            0 |            0 |            0 |         175779 |          205313 |             37139 |                   0 |                0 |                 0 |                 0 |                    0 |                  0 |                 0 |                  0 |              0 |                  0 |                  0 |                     0 |               0 |         242452 |                  0 |                 0 |                 0 |                0 |                   0 |                  0 |               0 |                   0 |             0 |            0 |                242452 |                       0 |                       0 |                   0 |                  0 |                 0 |           242452 |             242452 |        nan |                0 |            0 
|                 0 |                   0 |             0 |                  0 |                0 |                   0 |                      0 |           0 |                  0 |            0 |            0 |                    0 |                   0 |                    0 |                       0 |                   0 |             0 |                  0 |                      0 |                0 |              0 |                 0 |                      0 |           0 |                      0 |                  0 |                   0 |            0 |            0 |
| E       | 1.00189e+07 |   202105 |         19 | N                  | N          | N                   |               nan | N              | N                   | N                | N                 | N                    | nan                     | Y                  | Y                 | N                  | N              | Y              | N                    | N                    | N                 | N                     | N              | N                | N             | N                  | N                     | N                  | Y                | N                 |                nan | N                  | N                  | N                    | N              |               nan |                 nan |             nan |                  nan |                nan | N                | N                 | N                | N                 | nan               | N               | N           | N                   | N              | N             | N            | N              |           nan | N                   | N              |          1 |         0 | N                |                14 | Y                | Y               | Y              | N            | N                   | N                 |             nan | N                |           nan |              nan |             nan |            nan |          0 |          nan |              nan |            nan |             nan |              nan |              nan |                nan |                0 |                   0 |               0 |             0 |                nan |            0 |             nan |                 0 |               0 | N             |           nan |               0 |                 0 |             0 |                0 |                     0 |          15102 |                nan |              0 |                nan |              0 |                nan |              0 |                nan |              0 |                nan |              0 
|              0 |               0 |             4 |                     0 |            0 |             0 |              0 |               0 |                0 |             0 |              0 |            0 |             0 |             0 |              0 |              0 |          0 |          0 |         0 |             285 |                0 |             285 |         854292 |          748033 |         106259 |          101638 |            47148 |         54490 |            nan |          4656 |            nan |             371 |            nan |             345 |              41 |            5413 |       181553 |             0 |              0 |                 0 |              592 |                   0 |                  0 |          87593 |                   0 |                  0 |        7617 |                 0 |         675 |           5100 |                  0 |               0 |                      0 |                 0 |          273 |         2343 |          0 |              0 |         790 |        0 |                     0 |                   0 |             0 |              7661 |              9125 |        8376 |        15408 |        15013 |         4195 |         4045 |            0 |        15588 |         184394 |            9149 |             64413 |                   0 |                0 |                 0 |                 0 |                    0 |               3077 |                 0 |              29071 |              0 |              47576 |                  0 |                     0 |            2000 |         155286 |                  0 |                 0 |                 0 |                0 |                   0 |                  0 |               0 |                   0 |         26754 |        26754 |                     0 |                       0 |                       0 |                   0 |                  0 |            128532 |           128532 |             155286 |        nan |                0 |            0 
|                 0 |                   0 |             0 |                  0 |                0 |                   0 |                      0 |           0 |                  0 |            0 |            0 |                    0 |                   0 |                    0 |                       0 |                   0 |             0 |                  0 |                      0 |                0 |              0 |                 0 |                      0 |           0 |                      0 |                  0 |                   0 |            0 |            0 |
| E       | 1.00189e+07 |   202205 |         19 | N                  | N          | N                   |               nan | N              | N                   | N                | N                 | N                    | nan                     | Y                  | Y                 | N                  | N              | Y              | N                    | N                    | N                 | N                     | N              | N                | N             | N                  | N                     | N                  | Y                | N                 |                nan | N                  | N                  | N                    | N              |               nan |                 nan |             nan |                  nan |                nan | N                | N                 | N                | N                 | nan               | N               | N           | N                   | N              | N             | N            | N              |           nan | N                   | N              |          0 |         0 | N                |                13 | Y                | Y               | Y              | N            | N                   | N                 |             nan | N                |           nan |              nan |             nan |            nan |          0 |          nan |              nan |            nan |             nan |              nan |              nan |                nan |                0 |                   0 |               0 |             0 |                nan |            0 |             nan |                 0 |               0 | N             |           nan |               0 |                 0 |             0 |                0 |                     0 |          27640 |                nan |              0 |                nan |              0 |                nan |              0 |                nan |              0 |                nan |              0 
|              0 |               0 |             5 |                     0 |            0 |             0 |              0 |               0 |                0 |             0 |              0 |            0 |             0 |             0 |              0 |              0 |          0 |          0 |         0 |            5981 |                0 |            5981 |         932945 |          804952 |         127993 |          144587 |            69923 |         74664 |            nan |         24100 |            nan |               3 |            nan |               0 |               0 |           24103 |       260386 |             0 |              0 |                 0 |                0 |                   0 |                  0 |         102765 |                   0 |                  0 |        8936 |                 0 |           0 |           5525 |                  0 |               0 |                      0 |                 0 |          502 |         4238 |          0 |              0 |       10258 |        0 |                     0 |                   0 |             0 |              7623 |              7568 |        6707 |        22195 |         9544 |         4620 |         3844 |            0 |        20953 |         215278 |            8645 |             95227 |                   0 |                0 |                 0 |                 0 |                    0 |               3077 |                 0 |              27982 |              0 |              49581 |                  0 |                     0 |               0 |         184512 |                  0 |                 0 |                 0 |                0 |                   0 |                  0 |               0 |                   0 |          3835 |         3835 |                     0 |                       0 |                       0 |                   0 |                  0 |            180677 |           180677 |             184512 |        nan |                0 |            0 
|                 0 |                   0 |             0 |                  0 |                0 |                   0 |                      0 |           0 |                  0 |            0 |            0 |                    0 |                   0 |                    0 |                       0 |                   0 |             0 |                  0 |                      0 |                0 |              0 |                 0 |                      0 |           0 |                      0 |                  0 |                   0 |            0 |            0 |

Metadata:

| Column Name           | Data Type   | Unique Values   | Missing Values   |
|:----------------------|:------------|:----------------|:-----------------|
| efile                 | object      | 2               | 4                |
| EIN                   | float64     | 302,567         | 4                |
| tax_pd                | float64     | 115             | 4                |
| subseccd              | float64     | 24              | 4                |
| s501c3or4947a1cd      | object      | 2               | 59               |
| schdbind              | object      | 2               | 55               |
| politicalactvtscd     | object      | 2               | 53               |
| lbbyingactvtscd       | object      | 2               | 73,196           |
| subjto6033cd          | object      | 2               | 10,789           |
| dnradvisedfundscd     | object      | 2               | 50               |
| prptyintrcvdcd        | object      | 2               | 46               |
| maintwrkofartcd       | object      | 2               | 51               |
| crcounselingqstncd    | object      | 2               | 56               |
| hldassetsintermpermcd | object      | 2               | 316,793          |
| rptlndbldgeqptcd      | object      | 2               | 65               |
| rptinvstothsecd       | object      | 2               | 68               |
| rptinvstprgrelcd      | object      | 2               | 77               |
| rptothasstcd          | object      | 2               | 78               |
| rptothliabcd          | object      | 2               | 83               |
| sepcnsldtfinstmtcd    | object      | 2               | 86               |
| sepindaudfinstmtcd    | object      | 2               | 56               |
| inclinfinstmtcd       | object      | 2               | 88               |
| operateschools170cd   | object      | 2               | 66               |
| frgnofficecd          | object      | 2               | 70               |
| frgnrevexpnscd        | object      | 2               | 74               |
| frgngrntscd           | object      | 2               | 53               |
| frgnaggragrntscd      | object      | 2               | 52               |
| rptprofndrsngfeescd   | object      | 2               | 60               |
| rptincfnndrsngcd      | object      | 2               | 51               |
| rptincgamingcd        | object      | 2               | 53               |
| operatehosptlcd       | object      | 2               | 68               |
| hospaudfinstmtcd      | float64     | 2               | 322,523          |
| rptgrntstogovtcd      | object      | 2               | 73               |
| rptgrntstoindvcd      | object      | 2               | 63               |
| rptyestocompnstncd    | object      | 2               | 77               |
| txexmptbndcd          | object      | 2               | 84               |
| invstproceedscd       | object      | 2               | 313,308          |
| maintescrwaccntcd     | object      | 2               | 313,391          |
| actonbehalfcd         | object      | 2               | 313,505          |
| engageexcessbnftcd    | object      | 2               | 61,710           |
| awarexcessbnftcd      | object      | 2               | 61,514           |
| loantofficercd        | object      | 2               | 66               |
| grantoofficercd       | object      | 2               | 76               |
| dirbusnreltdcd        | object      | 2               | 69               |
| fmlybusnreltdcd       | object      | 2               | 88               |
| servasofficercd       | object      | 2               | 316,799          |
| recvnoncashcd         | object      | 2               | 70               |
| recvartcd             | object      | 2               | 60               |
| ceaseoperationscd     | object      | 2               | 68               |
| sellorexchcd          | object      | 2               | 63               |
| ownsepentcd           | object      | 2               | 60               |
| reltdorgcd            | object      | 2               | 70               |
| intincntrlcd          | object      | 2               | 310              |
| orgtrnsfrcd           | object      | 2               | 73,493           |
| conduct5percentcd     | object      | 2               | 78               |
| compltschocd          | object      | 2               | 133              |
| f1096cnt              | float64     | 2,831           | 4                |
| fw2gcnt               | float64     | 152             | 4                |
| wthldngrulescd        | object      | 2               | 122,684          |
| noemplyeesw3cnt       | float64     | 2,973           | 4                |
| filerqrdrtnscd        | object      | 2               | 123,640          |
| unrelbusinccd         | object      | 2               | 109              |
| filedf990tcd          | object      | 2               | 294,123          |
| frgnacctcd            | object      | 2               | 122              |
| prohibtdtxshltrcd     | object      | 2               | 88               |
| prtynotifyorgcd       | object      | 2               | 303              |
| filedf8886tcd         | object      | 2               | 323,361          |
| solicitcntrbcd        | object      | 2               | 136              |
| exprstmntcd           | object      | 2               | 315,315          |
| providegoodscd        | object      | 2               | 101,679          |
| notfydnrvalcd         | object      | 2               | 303,569          |
| filedf8282cd          | object      | 2               | 102,286          |
| f8282cnt              | float64     | 39              | 4                |
| fndsrcvdcd            | object      | 2               | 138,722          |
| premiumspaidcd        | object      | 2               | 138,694          |
| filedf8899cd          | object      | 2               | 248,621          |
| filedf1098ccd         | object      | 2               | 248,336          |
| excbushldngscd        | object      | 2               | 324,052          |
| s4966distribcd        | object      | 2               | 294,054          |
| distribtodonorcd      | object      | 2               | 294,352          |
| initiationfees        | float64     | 2,694           | 4                |
| grsrcptspublicuse     | float64     | 2,231           | 4                |
| grsincmembers         | float64     | 2,289           | 4                |
| grsincother           | float64     | 1,962           | 4                |
| filedlieuf1041cd      | object      | 1               | 322,811          |
| txexmptint            | float64     | 4               | 4                |
| qualhlthplncd         | object      | 2               | 320,991          |
| qualhlthreqmntn       | float64     | 8               | 4                |
| qualhlthonhnd         | float64     | 25              | 4                |
| rcvdpdtngcd           | object      | 2               | 429              |
| filedf720cd           | object      | 2               | 323,367          |
| totreprtabled         | float64     | 108,040         | 4                |
| totcomprelatede       | float64     | 25,017          | 4                |
| totestcompf           | float64     | 60,267          | 4                |
| noindiv100kcnt        | float64     | 944             | 4                |
| nocontractor100kcnt   | float64     | 414             | 4                |
| totcntrbgfts          | float64     | 219,353         | 4                |
| prgmservcode2acd      | float64     | 1,093           | 173,346          |
| totrev2acola          | float64     | 160,681         | 4                |
| prgmservcode2bcd      | float64     | 774             | 244,782          |
| totrev2bcola          | float64     | 76,944          | 4                |
| prgmservcode2ccd      | float64     | 580             | 279,064          |
| totrev2ccola          | float64     | 44,110          | 4                |
| prgmservcode2dcd      | float64     | 455             | 300,130          |
| totrev2dcola          | float64     | 25,391          | 4                |
| prgmservcode2ecd      | float64     | 374             | 312,378          |
| totrev2ecola          | float64     | 14,515          | 4                |
| totrev2fcola          | float64     | 13,938          | 4                |
| totprgmrevnue         | float64     | 172,610         | 4                |
| invstmntinc           | float64     | 73,932          | 4                |
| txexmptbndsproceeds   | float64     | 969             | 4                |
| royaltsinc            | float64     | 6,201           | 4                |
| grsrntsreal           | float64     | 26,604          | 4                |
| grsrntsprsnl          | float64     | 1,569           | 4                |
| rntlexpnsreal         | float64     | 14,631          | 4                |
| rntlexpnsprsnl        | float64     | 574             | 4                |
| rntlincreal           | float64     | 28,239          | 4                |
| rntlincprsnl          | float64     | 1,616           | 4                |
| netrntlinc            | float64     | 28,942          | 4                |
| grsalesecur           | float64     | 52,558          | 4                |
| grsalesothr           | float64     | 15,852          | 4                |
| cstbasisecur          | float64     | 47,130          | 4                |
| cstbasisothr          | float64     | 15,634          | 4                |
| gnlsecur              | float64     | 47,779          | 4                |
| gnlsothr              | float64     | 23,197          | 4                |
| netgnls               | float64     | 59,819          | 4                |
| grsincfndrsng         | float64     | 45,179          | 4                |
| lessdirfndrsng        | float64     | 38,808          | 4                |
| netincfndrsng         | float64     | 47,946          | 4                |
| grsincgaming          | float64     | 12,146          | 4                |
| lessdirgaming         | float64     | 10,817          | 4                |
| netincgaming          | float64     | 12,038          | 4                |
| grsalesinvent         | float64     | 31,967          | 4                |
| lesscstofgoods        | float64     | 28,316          | 4                |
| netincsales           | float64     | 30,344          | 4                |
| miscrev11acd          | float64     | 681             | 243,083          |
| miscrevtota           | float64     | 53,703          | 4                |
| miscrev11bcd          | float64     | 448             | 297,611          |
| miscrevtot11b         | float64     | 22,378          | 4                |
| miscrev11ccd          | float64     | 315             | 314,436          |
| miscrevtot11c         | float64     | 10,730          | 4                |
| miscrevtot11d         | float64     | 10,057          | 4                |
| miscrevtot11e         | float64     | 60,898          | 4                |
| totrevenue            | float64     | 291,897         | 4                |
| grntstogovt           | float64     | 41,319          | 4                |
| grnsttoindiv          | float64     | 28,863          | 4                |
| grntstofrgngovt       | float64     | 9,704           | 4                |
| benifitsmembrs        | float64     | 12,054          | 4                |
| compnsatncurrofcr     | float64     | 103,451         | 4                |
| compnsatnandothr      | float64     | 7,113           | 4                |
| othrsalwages          | float64     | 164,065         | 4                |
| pensionplancontrb     | float64     | 51,113          | 4                |
| othremplyeebenef      | float64     | 91,840          | 4                |
| payrolltx             | float64     | 97,103          | 4                |
| feesforsrvcmgmt       | float64     | 32,577          | 4                |
| legalfees             | float64     | 40,390          | 4                |
| accntingfees          | float64     | 52,646          | 4                |
| feesforsrvclobby      | float64     | 6,825           | 4                |
| profndraising         | float64     | 6,810           | 4                |
| feesforsrvcinvstmgmt  | float64     | 29,304          | 4                |
| feesforsrvcothr       | float64     | 91,138          | 4                |
| advrtpromo            | float64     | 50,907          | 4                |
| officexpns            | float64     | 80,120          | 4                |
| infotech              | float64     | 50,211          | 4                |
| royaltsexpns          | float64     | 2,611           | 4                |
| occupancy             | float64     | 114,974         | 4                |
| travel                | float64     | 48,208          | 4                |
| travelofpublicoffcl   | float64     | 1,386           | 4                |
| converconventmtng     | float64     | 41,412          | 4                |
| interestamt           | float64     | 45,153          | 4                |
| pymtoaffiliates       | float64     | 11,856          | 4                |
| deprcatndepletn       | float64     | 100,405         | 4                |
| insurance             | float64     | 72,911          | 4                |
| othrexpnsa            | float64     | 147,171         | 4                |
| othrexpnsb            | float64     | 101,478         | 4                |
| othrexpnsc            | float64     | 78,174          | 4                |
| othrexpnsd            | float64     | 61,365          | 4                |
| othrexpnse            | float64     | 4,157           | 4                |
| othrexpnsf            | float64     | 77,256          | 4                |
| totfuncexpns          | float64     | 287,489         | 4                |
| nonintcashend         | float64     | 219,323         | 4                |
| svngstempinvend       | float64     | 144,007         | 4                |
| pldgegrntrcvblend     | float64     | 45,168          | 4                |
| accntsrcvblend        | float64     | 88,070          | 4                |
| currfrmrcvblend       | float64     | 2,318           | 4                |
| rcvbldisqualend       | float64     | 345             | 4                |
| notesloansrcvblend    | float64     | 16,251          | 4                |
| invntriesalesend      | float64     | 34,053          | 4                |
| prepaidexpnsend       | float64     | 65,463          | 4                |
| lndbldgsequipend      | float64     | 167,873         | 4                |
| invstmntsend          | float64     | 77,435          | 4                |
| invstmntsothrend      | float64     | 26,937          | 4                |
| invstmntsprgmend      | float64     | 8,688           | 4                |
| intangibleassetsend   | float64     | 11,043          | 4                |
| othrassetsend         | float64     | 72,879          | 4                |
| totassetsend          | float64     | 303,295         | 4                |
| accntspayableend      | float64     | 116,076         | 4                |
| grntspayableend       | float64     | 6,319           | 4                |
| deferedrevnuend       | float64     | 55,752          | 4                |
| txexmptbndsend        | float64     | 5,875           | 4                |
| escrwaccntliabend     | float64     | 6,363           | 4                |
| paybletoffcrsend      | float64     | 4,552           | 4                |
| secrdmrtgsend         | float64     | 52,220          | 4                |
| unsecurednotesend     | float64     | 21,736          | 4                |
| othrliabend           | float64     | 81,595          | 4                |
| totliabend            | float64     | 176,487         | 4                |
| unrstrctnetasstsend   | float64     | 135,082         | 4                |
| temprstrctnetasstsend | float64     | 46,602          | 4                |
| permrstrctnetasstsend | float64     | 85              | 4                |
| capitalstktrstend     | float64     | 12,542          | 4                |
| paidinsurplusend      | float64     | 6,418           | 4                |
| retainedearnend       | float64     | 67,507          | 4                |
| totnetassetend        | float64     | 300,764         | 4                |
| totnetliabastend      | float64     | 303,337         | 4                |
| nonpfrea              | float64     | 15              | 73,594           |
| totnooforgscnt        | float64     | 86              | 4                |
| totsupport            | float64     | 8,015           | 4                |
| gftgrntsrcvd170       | float64     | 113,604         | 4                |
| txrevnuelevied170     | float64     | 2,443           | 4                |
| srvcsval170           | float64     | 2,876           | 4                |
| pubsuppsubtot170      | float64     | 113,800         | 4                |
| exceeds2pct170        | float64     | 42,382          | 4                |
| pubsupplesspct170     | float64     | 113,719         | 4                |
| samepubsuppsubtot170  | float64     | 113,800         | 4                |
| grsinc170             | float64     | 51,910          | 4                |
| netincunreltd170      | float64     | 8,383           | 4                |
| othrinc170            | float64     | 35,188          | 4                |
| totsupp170            | float64     | 114,386         | 4                |
| grsrcptsrelated170    | float64     | 49,840          | 4                |
| totgftgrntrcvd509     | float64     | 83,625          | 4                |
| grsrcptsadmissn509    | float64     | 66,560          | 4                |
| grsrcptsactivities509 | float64     | 11,609          | 4                |
| txrevnuelevied509     | float64     | 832             | 4                |
| srvcsval509           | float64     | 857             | 4                |
| pubsuppsubtot509      | float64     | 92,621          | 3                |
| rcvdfrmdisqualsub509  | float64     | 10,513          | 4                |
| exceeds1pct509        | float64     | 7,215           | 4                |
| subtotpub509          | float64     | 14,966          | 4                |
| pubsupplesub509       | float64     | 92,478          | 4                |
| samepubsuppsubtot509  | float64     | 92,616          | 4                |
| grsinc509             | float64     | 34,084          | 4                |
| unreltxincls511tx509  | float64     | 1,400           | 4                |
| subtotsuppinc509      | float64     | 34,431          | 4                |
| netincunrelatd509     | float64     | 4,638           | 4                |
| othrinc509            | float64     | 19,787          | 4                |
| totsupp509            | float64     | 92,791          | 4                |
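
A per-column summary like the Metadata table above can be built with a few pandas calls. The following is a minimal sketch, not necessarily the code used to produce that table: the helper name `summarize_columns` and the toy values are hypothetical, and it assumes missing values are excluded from the unique count.

```python
import pandas as pd

def summarize_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per column: dtype, distinct non-null values, missing count."""
    return pd.DataFrame({
        "Column Name": df.columns,
        "Data Type": df.dtypes.astype(str).to_numpy(),
        # Assumption: NaN is not counted as a distinct value.
        "Unique Values": df.nunique(dropna=True).to_numpy(),
        "Missing Values": df.isna().sum().to_numpy(),
    })

# Toy frame standing in for the real extract (values are made up).
toy = pd.DataFrame({
    "efile": ["E", "P", "E", None],
    "EIN": [10018922.0, 426057254.0, 10018922.0, None],
    "lbbyingactvtscd": ["N", None, "Y", None],
})

summary = summarize_columns(toy)
print(summary.to_string(index=False))
```

Applied to the full extract, the same helper would take `form_990_2022` as its argument; the counts can then be formatted with thousands separators before display.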

# 3. Data Preparation

In this section, we detail the initial steps taken to prepare the IRS Form 990 data for analysis. Our goals are to standardize column names, handle missing values appropriately, and convert columns to data types suited to our analysis. Click the drop-down arrow for details on the code used.
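
To make the effect of these steps concrete, here is a small sketch on a toy frame with made-up values. It assumes, as in the raw extract, that `tax_pd` arrives as a `YYYYMM` number read as a float and that an organization (identified by EIN) can appear more than once; the frame and its values are hypothetical.

```python
import pandas as pd

# Toy frame with made-up values; tax_pd is YYYYMM stored as a float.
toy = pd.DataFrame({
    "EIN": [111111111.0, 111111111.0, 222222222.0],
    "tax_pd": [202112.0, 202212.0, 202206.0],
    "totrevenue": [100.0, 150.0, 75.0],
})

# Standardize column names to lower case.
toy.columns = [c.lower() for c in toy.columns]

# Strip the trailing ".0" left by the float rendering, then parse as year-month.
toy["tax_pd"] = pd.to_datetime(
    toy["tax_pd"].astype(str).str.replace(r"\.0$", "", regex=True),
    format="%Y%m",
    errors="coerce",
)

# Keep only the most recent filing per EIN.
latest = toy.sort_values("tax_pd").drop_duplicates("ein", keep="last")
latest = latest.reset_index(drop=True)
print(latest)
```

In this toy example the 2021 filing for the first EIN is dropped and one row per organization remains, ordered by tax period.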

In [2]:
# Standardize column names.
form_990_2022.columns = [x.lower() for x in form_990_2022.columns]

# Replace zeros with NaN for appropriate columns.

# Replace NaN with appropriate values accordingly.

# Parse tax_pd (YYYYMM, read in as a float) into a datetime.
date_cols = ['tax_pd']
for col in date_cols:
    # Raw string avoids an invalid-escape warning for '\.'.
    form_990_2022[col] = form_990_2022[col].astype(str).str.replace(r'\.0$', '', regex=True)
    form_990_2022[col] = pd.to_datetime(form_990_2022[col], format='%Y%m', errors='coerce')

# Drop duplicate EINs, keeping the row with the latest tax_pd.
form_990_2022 = form_990_2022.sort_values('tax_pd').drop_duplicates('ein', keep='last')

# Store EIN as a string identifier rather than a float.
form_990_2022['ein'] = form_990_2022['ein'].astype(str).str.replace(r'\.0$', '', regex=True)

# Show cleaned data.
head_caption = "Cleaned data sample view:"
head_df = form_990_2022.head().copy()
head_markdown = head_caption + "\n\n" + head_df.to_markdown(index=False)
display(Markdown(head_markdown))

Cleaned data sample view:

| efile   |       ein | tax_pd              |   subseccd | s501c3or4947a1cd   | schdbind   | politicalactvtscd   | lbbyingactvtscd   | subjto6033cd   | dnradvisedfundscd   | prptyintrcvdcd   | maintwrkofartcd   | crcounselingqstncd   | hldassetsintermpermcd   | rptlndbldgeqptcd   | rptinvstothsecd   | rptinvstprgrelcd   | rptothasstcd   | rptothliabcd   | sepcnsldtfinstmtcd   | sepindaudfinstmtcd   | inclinfinstmtcd   | operateschools170cd   | frgnofficecd   | frgnrevexpnscd   | frgngrntscd   | frgnaggragrntscd   | rptprofndrsngfeescd   | rptincfnndrsngcd   | rptincgamingcd   | operatehosptlcd   |   hospaudfinstmtcd | rptgrntstogovtcd   | rptgrntstoindvcd   | rptyestocompnstncd   | txexmptbndcd   | invstproceedscd   | maintescrwaccntcd   | actonbehalfcd   | engageexcessbnftcd   | awarexcessbnftcd   | loantofficercd   | grantoofficercd   | dirbusnreltdcd   | fmlybusnreltdcd   | servasofficercd   | recvnoncashcd   | recvartcd   | ceaseoperationscd   | sellorexchcd   | ownsepentcd   | reltdorgcd   | intincntrlcd   | orgtrnsfrcd   | conduct5percentcd   | compltschocd   |   f1096cnt |   fw2gcnt | wthldngrulescd   |   noemplyeesw3cnt | filerqrdrtnscd   | unrelbusinccd   |   filedf990tcd | frgnacctcd   | prohibtdtxshltrcd   | prtynotifyorgcd   |   filedf8886tcd | solicitcntrbcd   |   exprstmntcd | providegoodscd   |   notfydnrvalcd | filedf8282cd   |   f8282cnt | fndsrcvdcd   | premiumspaidcd   | filedf8899cd   | filedf1098ccd   | excbushldngscd   | s4966distribcd   | distribtodonorcd   |   initiationfees |   grsrcptspublicuse |   grsincmembers |   grsincother |   filedlieuf1041cd |   txexmptint |   qualhlthplncd |   qualhlthreqmntn |   qualhlthonhnd | rcvdpdtngcd   |   filedf720cd |   totreprtabled |   totcomprelatede |   totestcompf |   noindiv100kcnt |   nocontractor100kcnt |   totcntrbgfts |   prgmservcode2acd |   totrev2acola |   prgmservcode2bcd |   totrev2bcola |   prgmservcode2ccd |   totrev2ccola |   prgmservcode2dcd |   totrev2dcola |   prgmservcode2ecd |   
totrev2ecola |   totrev2fcola |   totprgmrevnue |   invstmntinc |   txexmptbndsproceeds |   royaltsinc |   grsrntsreal |   grsrntsprsnl |   rntlexpnsreal |   rntlexpnsprsnl |   rntlincreal |   rntlincprsnl |   netrntlinc |   grsalesecur |   grsalesothr |   cstbasisecur |   cstbasisothr |   gnlsecur |   gnlsothr |   netgnls |   grsincfndrsng |   lessdirfndrsng |   netincfndrsng |   grsincgaming |   lessdirgaming |   netincgaming |   grsalesinvent |   lesscstofgoods |   netincsales |   miscrev11acd |   miscrevtota |   miscrev11bcd |   miscrevtot11b |   miscrev11ccd |   miscrevtot11c |   miscrevtot11d |   miscrevtot11e |   totrevenue |   grntstogovt |   grnsttoindiv |   grntstofrgngovt |   benifitsmembrs |   compnsatncurrofcr |   compnsatnandothr |   othrsalwages |   pensionplancontrb |   othremplyeebenef |   payrolltx |   feesforsrvcmgmt |   legalfees |   accntingfees |   feesforsrvclobby |   profndraising |   feesforsrvcinvstmgmt |   feesforsrvcothr |   advrtpromo |   officexpns |   infotech |   royaltsexpns |   occupancy |   travel |   travelofpublicoffcl |   converconventmtng |   interestamt |   pymtoaffiliates |   deprcatndepletn |   insurance |   othrexpnsa |   othrexpnsb |   othrexpnsc |   othrexpnsd |   othrexpnse |   othrexpnsf |   totfuncexpns |   nonintcashend |   svngstempinvend |   pldgegrntrcvblend |   accntsrcvblend |   currfrmrcvblend |   rcvbldisqualend |   notesloansrcvblend |   invntriesalesend |   prepaidexpnsend |   lndbldgsequipend |   invstmntsend |   invstmntsothrend |   invstmntsprgmend |   intangibleassetsend |   othrassetsend |     totassetsend |   accntspayableend |   grntspayableend |   deferedrevnuend |   txexmptbndsend |   escrwaccntliabend |   paybletoffcrsend |   secrdmrtgsend |   unsecurednotesend |   othrliabend |   totliabend |   unrstrctnetasstsend |   temprstrctnetasstsend |   permrstrctnetasstsend |   capitalstktrstend |   paidinsurplusend |   retainedearnend |   totnetassetend |   totnetliabastend |   nonpfrea |   totnooforgscnt 
|   totsupport |   gftgrntsrcvd170 |   txrevnuelevied170 |   srvcsval170 |   pubsuppsubtot170 |   exceeds2pct170 |   pubsupplesspct170 |   samepubsuppsubtot170 |   grsinc170 |   netincunreltd170 |   othrinc170 |       totsupp170 |   grsrcptsrelated170 |   totgftgrntrcvd509 |   grsrcptsadmissn509 |   grsrcptsactivities509 |   txrevnuelevied509 |   srvcsval509 |   pubsuppsubtot509 |   rcvdfrmdisqualsub509 |   exceeds1pct509 |   subtotpub509 |   pubsupplesub509 |   samepubsuppsubtot509 |   grsinc509 |   unreltxincls511tx509 |   subtotsuppinc509 |   netincunrelatd509 |   othrinc509 |   totsupp509 |
|:--------|----------:|:--------------------|-----------:|:-------------------|:-----------|:--------------------|:------------------|:---------------|:--------------------|:-----------------|:------------------|:---------------------|:------------------------|:-------------------|:------------------|:-------------------|:---------------|:---------------|:---------------------|:---------------------|:------------------|:----------------------|:---------------|:-----------------|:--------------|:-------------------|:----------------------|:-------------------|:-----------------|:------------------|-------------------:|:-------------------|:-------------------|:---------------------|:---------------|:------------------|:--------------------|:----------------|:---------------------|:-------------------|:-----------------|:------------------|:-----------------|:------------------|:------------------|:----------------|:------------|:--------------------|:---------------|:--------------|:-------------|:---------------|:--------------|:--------------------|:---------------|-----------:|----------:|:-----------------|------------------:|:-----------------|:----------------|---------------:|:-------------|:--------------------|:------------------|----------------:|:-----------------|--------------:|:-----------------|----------------:|:---------------|-----------:|:-------------|:-----------------|:---------------|:----------------|:-----------------|:-----------------|:-------------------|-----------------:|--------------------:|----------------:|--------------:|-------------------:|-------------:|----------------:|------------------:|----------------:|:--------------|--------------:|----------------:|------------------:|--------------:|-----------------:|----------------------:|---------------:|-------------------:|---------------:|-------------------:|---------------:|-------------------:|---------------:|-------------------:|---------------:|-------------------:|--------
-------:|---------------:|----------------:|--------------:|----------------------:|-------------:|--------------:|---------------:|----------------:|-----------------:|--------------:|---------------:|-------------:|--------------:|--------------:|---------------:|---------------:|-----------:|-----------:|----------:|----------------:|-----------------:|----------------:|---------------:|----------------:|---------------:|----------------:|-----------------:|--------------:|---------------:|--------------:|---------------:|----------------:|---------------:|----------------:|----------------:|----------------:|-------------:|--------------:|---------------:|------------------:|-----------------:|--------------------:|-------------------:|---------------:|--------------------:|-------------------:|------------:|------------------:|------------:|---------------:|-------------------:|----------------:|-----------------------:|------------------:|-------------:|-------------:|-----------:|---------------:|------------:|---------:|----------------------:|--------------------:|--------------:|------------------:|------------------:|------------:|-------------:|-------------:|-------------:|-------------:|-------------:|-------------:|---------------:|----------------:|------------------:|--------------------:|-----------------:|------------------:|------------------:|---------------------:|-------------------:|------------------:|-------------------:|---------------:|-------------------:|-------------------:|----------------------:|----------------:|-----------------:|-------------------:|------------------:|------------------:|-----------------:|--------------------:|-------------------:|----------------:|--------------------:|--------------:|-------------:|----------------------:|------------------------:|------------------------:|--------------------:|-------------------:|------------------:|-----------------:|-------------------:|-----------:|-----------------:|----
---------:|------------------:|--------------------:|--------------:|-------------------:|-----------------:|--------------------:|-----------------------:|------------:|-------------------:|-------------:|-----------------:|---------------------:|--------------------:|---------------------:|------------------------:|--------------------:|--------------:|-------------------:|-----------------------:|-----------------:|---------------:|------------------:|-----------------------:|------------:|-----------------------:|-------------------:|--------------------:|-------------:|-------------:|
| P       | 426057254 | 2011-06-01 00:00:00 |          2 | N                  | N          | N                   | nan               | N              | N                   | N                | N                 | N                    | N                       | Y                  | N                 | N                  | N              | Y              | N                    | N                    | N                 | N                     | N              | N                | N             | N                  | N                     | N                  | N                | N                 |                nan | N                  | N                  | N                    | N              | nan               | nan                 | nan             | N                    | N                  | N                | N                 | N                | N                 | N                 | N               | N           | N                   | N              | N             | N            | N              | N             | N                   | Y              |          0 |         0 | Y                |                 0 | nan              | N               |            nan | N            | N                   | N                 |             nan | N                |           nan | nan              |             nan | nan            |          0 | nan          | nan              | nan            | nan             | nan              | nan              | nan                |                0 |                   0 |               0 |             0 |                nan |            0 |             nan |                 0 |               0 | N             |           nan |               0 |                 0 |             0 |                0 |                     0 |              0 |                nan |         141124 |                nan |              0 |                nan |              0 |                nan |              0 |                nan |        
      0 |              0 |          141124 |             0 |                     0 |            0 |             0 |              0 |               0 |                0 |             0 |              0 |            0 |             0 |             0 |              0 |              0 |          0 |          0 |         0 |               0 |                0 |               0 |              0 |               0 |              0 |               0 |                0 |             0 |            nan |             0 |            nan |               0 |            nan |               0 |               0 |               0 |       141124 |             0 |              0 |                 0 |                0 |                   0 |                  0 |              0 |                   0 |                  0 |           0 |                 0 |           0 |              0 |                  0 |               0 |                      0 |                 0 |            0 |          313 |          0 |              0 |           0 |        0 |                     0 |                   0 |         51812 |                 0 |                 0 |       10006 |        21664 |        26817 |         1000 |            0 |            0 |            0 |         111612 |           52057 |                 0 |                   0 |                0 |                 0 |                 0 |                    0 |                  0 |                 0 |        1.45743e+06 |              0 |                  0 |                  0 |                     0 |               0 |      1.50949e+06 |                  0 |                 0 |                 0 |                0 |                   0 |                  0 |          622410 |                   0 |          7600 |       630010 |                879477 |                       0 |                       0 |                   0 |                  0 |                 0 |           879477 |        1.50949e+06 |        nan |                0 |    
        0 |       0           |                   0 |             0 |        0           |                0 |         0           |            0           |           0 |                  0 |            0 |      0           |                    0 |                   0 |                    0 |                       0 |                   0 |             0 |                  0 |                      0 |                0 |              0 |                 0 |                      0 |           0 |                      0 |                  0 |                   0 |            0 |            0 |
| P       |  60891737 | 2011-10-01 00:00:00 |          5 | N                  | N          | N                   | nan               | N              | N                   | N                | N                 | N                    | N                       | Y                  | N                 | N                  | N              | N              | N                    | N                    | N                 | N                     | N              | N                | N             | N                  | N                     | N                  | N                | N                 |                nan | N                  | N                  | N                    | N              | N                 | N                   | N               | nan                  | nan                | N                | N                 | N                | N                 | N                 | N               | N           | N                   | N              | N             | N            | N              | nan           | N                   | Y              |          0 |         0 | Y                |                43 | Y                | N               |            nan | N            | N                   | N                 |             nan | N                |           nan | nan              |             nan | nan            |          0 | nan          | nan              | nan            | nan             | nan              | nan              | nan                |                0 |                   0 |               0 |             0 |                nan |            0 |             nan |                 0 |               0 | N             |           nan |               0 |                 0 |             0 |                0 |                     0 |              0 |                nan |         296865 |                nan |              0 |                nan |              0 |                nan |              0 |                nan |        
      0 |              0 |          296865 |           231 |                     0 |            0 |             0 |              0 |               0 |                0 |             0 |              0 |            0 |             0 |             0 |              0 |              0 |          0 |          0 |         0 |               0 |                0 |               0 |              0 |               0 |              0 |               0 |                0 |             0 |            nan |          2703 |            nan |            3000 |            nan |               0 |               0 |            5703 |       302799 |             0 |              0 |                 0 |                0 |                   0 |                  0 |          30171 |                   0 |                  0 |        3209 |                 0 |           0 |              0 |                  0 |               0 |                      0 |             31848 |        23347 |          330 |          0 |              0 |       46963 |        0 |                     0 |                5067 |         20204 |                 0 |             33976 |       19021 |        41487 |         5184 |        41184 |        37261 |        78940 |            0 |         418192 |           17327 |             10879 |                   0 |                0 |                 0 |                 0 |                 2602 |                  0 |                 0 |   934054           |              0 |                  0 |                  0 |                     0 |               0 | 964862           |                  0 |                 0 |                 0 |                0 |                   0 |                  0 |          375650 |                   0 |             0 |       375650 |                589212 |                       0 |                       0 |                   0 |                  0 |                 0 |           589212 |   964862           |        nan |                0 |    
        0 |       0           |                   0 |             0 |        0           |                0 |         0           |            0           |           0 |                  0 |            0 |      0           |                    0 |                   0 |                    0 |                       0 |                   0 |             0 |                  0 |                      0 |                0 |              0 |                 0 |                      0 |           0 |                      0 |                  0 |                   0 |            0 |            0 |
| P       | 660550623 | 2011-12-01 00:00:00 |          3 | Y                  | Y          | N                   | N                 | N              | N                   | N                | N                 | N                    | N                       | N                  | N                 | N                  | N              | N              | N                    | N                    | N                 | N                     | N              | N                | N             | N                  | N                     | N                  | N                | N                 |                nan | N                  | N                  | N                    | N              | nan               | nan                 | nan             | N                    | N                  | N                | N                 | N                | N                 | N                 | N               | N           | N                   | N              | N             | N            | N              | N             | N                   | Y              |      20319 |         0 | N                |                 2 | Y                | N               |            nan | N            | N                   | N                 |             nan | N                |           nan | nan              |             nan | nan            |          0 | nan          | nan              | nan            | nan             | nan              | nan              | nan                |                0 |                   0 |               0 |             0 |                nan |            0 |             nan |                 0 |               0 | N             |           nan |               0 |                 0 |             0 |                0 |                     0 |          51958 |                nan |              0 |                nan |              0 |                nan |              0 |                nan |              0 |                nan |        
      0 |              0 |               0 |             0 |                     0 |            0 |             0 |              0 |               0 |                0 |             0 |              0 |            0 |             0 |             0 |              0 |              0 |          0 |          0 |         0 |               0 |                0 |               0 |              0 |               0 |              0 |               0 |                0 |             0 |            nan |             0 |            nan |               0 |            nan |               0 |               0 |               0 |        51958 |             0 |              0 |                 0 |                0 |                   0 |                  0 |          20790 |                   0 |               1193 |        2033 |                 0 |           0 |              0 |                  0 |               0 |                      0 |             11954 |         2844 |         5521 |          0 |              0 |           0 |     4832 |                     0 |                   0 |             0 |                 0 |                 0 |         857 |            0 |            0 |            0 |            0 |            0 |            0 |          50024 |           55339 |                 0 |                   0 |                0 |                 0 |                 0 |                    0 |                  0 |                 0 |        0           |              0 |                  0 |                  0 |                     0 |               0 |  55339           |                  0 |                 0 |                 0 |                0 |                   0 |                  0 |               0 |                   0 |             0 |            0 |                     0 |                       0 |                       0 |                   0 |                  0 |             55339 |            55339 |    55339           |          7 |                0 |    
        0 |  297377           |               24187 |         20450 |   342014           |                0 |    342014           |       342014           |           0 |                  0 |            0 | 342014           |                    0 |                   0 |                    0 |                       0 |                   0 |             0 |                  0 |                      0 |                0 |              0 |                 0 |                      0 |           0 |                      0 |                  0 |                   0 |            0 |            0 |
| P       | 464039105 | 2012-06-01 00:00:00 |          3 | Y                  | Y          | N                   | N                 | N              | N                   | N                | N                 | N                    | N                       | N                  | N                 | N                  | N              | Y              | N                    | Y                    | N                 | N                     | N              | N                | N             | N                  | N                     | N                  | N                | N                 |                nan | N                  | N                  | N                    | N              | nan               | nan                 | nan             | N                    | N                  | N                | N                 | N                | N                 | N                 | N               | N           | N                   | N              | N             | Y            | N              | N             | N                   | Y              |          0 |         0 | Y                |                22 | Y                | N               |            nan | N            | N                   | N                 |             nan | N                |           nan | N                |             nan | N              |          0 | N            | N                | nan            | nan             | nan              | nan              | nan                |                0 |                   0 |               0 |             0 |                nan |            0 |             nan |                 0 |               0 | N             |           nan |           53800 |                 0 |             0 |                0 |                     0 |         768494 |             616000 |          99738 |                nan |              0 |                nan |              0 |                nan |              0 |                nan |        
      0 |              0 |           99738 |             0 |                     0 |            0 |             0 |              0 |               0 |                0 |             0 |              0 |            0 |             0 |             0 |              0 |              0 |          0 |          0 |         0 |               0 |                0 |               0 |              0 |               0 |              0 |               0 |                0 |             0 |            nan |             0 |            nan |               0 |            nan |               0 |               0 |               0 |       868232 |             0 |              0 |                 0 |                0 |               53800 |                  0 |         517025 |                   0 |              18559 |       43668 |                 0 |           0 |              0 |                  0 |               0 |                      0 |                 0 |            0 |        21734 |          0 |              0 |       64500 |       82 |                     0 |                   0 |          3858 |                 0 |                 0 |       17371 |        29928 |        88642 |         9241 |        45458 |        20206 |            0 |         934072 |           96395 |                 0 |                   0 |                0 |                 0 |                 0 |                    0 |                  0 |                 0 |        0           |              0 |                  0 |                  0 |                     0 |               0 |  96395           |                  0 |                 0 |                 0 |                0 |                   0 |                  0 |               0 |               82900 |         36058 |       118958 |                -22563 |                       0 |                       0 |                   0 |                  0 |                 0 |           -22563 |    96395           |          7 |                0 |    
        0 |       4.03637e+06 |                   0 |             0 |        4.03637e+06 |                0 |         4.03637e+06 |            4.03637e+06 |           0 |                  0 |            0 |      4.03637e+06 |                    0 |                   0 |                    0 |                       0 |                   0 |             0 |                  0 |                      0 |                0 |              0 |                 0 |                      0 |           0 |                      0 |                  0 |                   0 |            0 |            0 |
| P       | 161696098 | 2012-12-01 00:00:00 |          3 | Y                  | N          | N                   | N                 | N              | N                   | N                | N                 | N                    | N                       | N                  | N                 | N                  | N              | N              | N                    | N                    | N                 | N                     | N              | N                | N             | N                  | N                     | N                  | N                | N                 |                nan | N                  | N                  | N                    | N              | nan               | nan                 | nan             | N                    | N                  | N                | N                 | N                | N                 | N                 | N               | N           | N                   | N              | N             | N            | N              | N             | N                   | Y              |          0 |         0 | N                |                 0 | N                | N               |            nan | N            | N                   | N                 |             nan | N                |           nan | N                |             nan | N              |          0 | N            | N                | N              | N               | N                | N                | N                  |                0 |                   0 |               0 |             0 |                nan |            0 |             nan |                 0 |               0 | N             |           nan |               0 |                 0 |             0 |                0 |                     0 |           7760 |             900099 |              0 |                nan |              0 |                nan |              0 |                nan |              0 |                nan |        
      0 |              0 |               0 |             0 |                     0 |            0 |             0 |              0 |               0 |                0 |             0 |              0 |            0 |             0 |             0 |              0 |              0 |          0 |          0 |         0 |               0 |                0 |               0 |              0 |               0 |              0 |               0 |                0 |             0 |            nan |             0 |            nan |               0 |            nan |               0 |               0 |               0 |         7760 |             0 |              0 |                 0 |                0 |                   0 |                  0 |              0 |                   0 |                  0 |           0 |               612 |           0 |              0 |                  0 |               0 |                      0 |                 0 |            0 |         3916 |          0 |              0 |           0 |        0 |                     0 |                   0 |             0 |                 0 |                 0 |           0 |         2160 |         1548 |         1699 |          500 |         2617 |            0 |          13052 |           -5292 |                 0 |                   0 |                0 |                 0 |                 0 |                    0 |                  0 |                 0 |        0           |              0 |                  0 |                  0 |                     0 |               0 |  -5292           |                  0 |                 0 |                 0 |                0 |                   0 |                  0 |               0 |                   0 |             0 |            0 |                 -5292 |                       0 |                       0 |                   0 |                  0 |                 0 |            -5292 |    -5292           |          9 |                0 |    
        0 |       0           |                   0 |             0 |        0           |                0 |         0           |            0           |           0 |                  0 |            0 |      0           |                    0 |              121247 |                    0 |                       0 |                   0 |             0 |             121247 |                      0 |                0 |              0 |            121247 |                 121247 |           0 |                      0 |                  0 |                   0 |            0 |       121247 |

# 4. Analysis
Objective: Determine whether the 990 Form can support a comprehensive analysis of existing philanthropic giving in environmental and social justice, and whether it can be used to assess the level of transparency and accountability in current giving practices.


In [3]:
def display_head(df, columns, caption):
    """Render the first five rows of the selected columns under a caption."""
    head_df = df[columns].head()
    head_markdown = f"{caption}\n\n{head_df.to_markdown(index=False)}"
    display(Markdown(head_markdown))

def display_unique_values(df, columns):
    """Render counts of each unique value combination (rows containing NaN are excluded)."""
    unique_val_df = df[columns].value_counts().reset_index()
    display(Markdown(unique_val_df.to_markdown(index=False)))

def display_missing_cts(df, columns):
    """Render the number of missing values in each selected column."""
    missing_values = df[columns].isna().sum().reset_index()
    missing_values = missing_values.rename(columns={'index': 'column name', 0: 'missing data points'})
    display(Markdown(missing_values.to_markdown(index=False)))

def display_stats(df, columns):
    """Render describe() output, formatting float columns with thousands separators."""
    stats = df[columns].describe().reset_index()
    for col in stats.columns[1:]:
        if pd.api.types.is_float_dtype(df[col]):
            stats[col] = stats[col].astype(float).apply(lambda x: f"{x:,.0f}")
    stats_markdown = stats.rename(columns={'index': 'Statistics'}).to_markdown(index=False)
    display(Markdown(stats_markdown))

## 4.1. Identifying Relevant Organizations

The 990 Form can be filtered to reflect organizations by their codes (reflecting their primary mission), enabling the identification of nonprofits focused on environmental protection, social justice, advocacy, and related activities. This step is crucial for creating a focused dataset of relevant organizations for our objective above.

Assuming the action item below has been completed, the next data manipulation step should filter the 990 Form in the same way the Exempt Organizations Business Master File was filtered, so that the two datasets stay consistent.

**Action item**: Review with the team to determine which column makes the most sense for filtering relevant organizations. Options include:
* Subsection and Classification codes.
* National Taxonomy of Exempt Entities (NTEE) codes (unfortunately, many are missing).
* Foundation codes.
* Activity codes (most likely not useful, since they became obsolete with the adoption of the NTEE coding system in January 1995).

In [4]:
# Insert code here for appropriate filtering if necessary.
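
Until the action item is settled, the following is only a minimal sketch of what an NTEE-based filter could look like. It assumes the Business Master File exposes `EIN` and `NTEE_CD` columns and that NTEE major groups C (Environment) and R (Civil Rights, Social Action & Advocacy) are the groups of interest. The toy data frames, the column names, and the choice of groups are all illustrative assumptions, not the team's agreed method.

```python
import pandas as pd

# Hypothetical stand-ins for the 990 extract and the Business Master File (BMF).
# Column names and values are assumed for illustration only.
form_990 = pd.DataFrame({
    "ein": [111, 222, 333, 444],
    "totrevenue": [1000, 2500, 400, 90],
})
bmf = pd.DataFrame({
    "EIN": [111, 222, 333, 444],
    "NTEE_CD": ["C30", "R20", "B21", None],
})

# Assumed NTEE major groups: C = Environment,
# R = Civil Rights, Social Action & Advocacy.
target_groups = ["C", "R"]

# The first character of an NTEE code is its major group.
# Missing codes yield NaN here and therefore never match.
is_relevant = bmf["NTEE_CD"].str[0].isin(target_groups)
relevant_eins = bmf.loc[is_relevant, "EIN"]

# Keep only the 990 rows whose EIN appears in the filtered BMF.
relevant_990 = form_990[form_990["ein"].isin(relevant_eins)]
print(relevant_990)
```

Because organizations with a missing NTEE code are silently excluded by this approach, the number of dropped rows is worth reporting alongside any filtered result.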

## 4.2. Compliance and Legal Requirements

Analyzing compliance and legal requirements for tax-exempt organizations helps assess their transparency and accountability. It ensures they adhere to legal and financial standards, showcases their commitment to ethical practices, and reflects their operational integrity.

### 4.2.1. Tax Compliance and Reporting

* **Compliance with backup withholding?** (wthldngrulescd): 
  * Assess whether organizations comply with backup withholding requirements, indicating adherence to tax obligations.
* **Employment tax returns filed?** (filerqrdrtnscd): 
  * Indicates whether the organization has filed the required employment tax returns (like Form 941 or Form 944), which report wages paid and the taxes withheld from employees.
* **Form 990-T filed?** (filedf990tcd): 
  * This form is filed by tax-exempt organizations that have unrelated business income (UBI) that is taxable.


In [5]:
head_columns = ['ein', 'wthldngrulescd','filerqrdrtnscd','filedf990tcd']
#display_head(form_990_2022, head_columns, "Tax Compliance/Reporting Columns, Unique Values and Missing Data Counts:")

unique_columns = ['wthldngrulescd','filerqrdrtnscd','filedf990tcd']
display_stats(form_990_2022, unique_columns)

display_missing_cts(form_990_2022, unique_columns)


| Statistics   | wthldngrulescd   | filerqrdrtnscd   | filedf990tcd   |
|:-------------|:-----------------|:-----------------|:---------------|
| count        | 189006           | 188734           | 29617          |
| unique       | 2                | 2                | 2              |
| top          | Y                | Y                | Y              |
| freq         | 140610           | 172425           | 24954          |

| column name    |   missing data points |
|:---------------|----------------------:|
| wthldngrulescd |                113562 |
| filerqrdrtnscd |                113834 |
| filedf990tcd   |                272951 |
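
The counts above are easier to compare as rates. For example, `filedf990tcd` is missing for 272,951 of the 302,568 rows (about 90%), while among the 29,617 reported values 24,954 (about 84%) are `Y`. A minimal sketch of a helper that computes both rates is shown below. The function name `yes_no_summary` and the toy data are hypothetical, and it assumes the indicator columns hold only `Y`, `N`, or missing values.

```python
import pandas as pd

def yes_no_summary(df, columns):
    """Summarise Y/N indicator columns: missing share and share of 'Y' among reported values."""
    rows = []
    for col in columns:
        reported = int(df[col].notna().sum())
        yes = int((df[col] == "Y").sum())
        rows.append({
            "column": col,
            "reported": reported,
            "pct_missing": round(100 * (1 - reported / len(df)), 1),
            # Guard against columns with no reported values at all.
            "pct_yes_of_reported": round(100 * yes / reported, 1) if reported else float("nan"),
        })
    return pd.DataFrame(rows)

# Toy data standing in for form_990_2022 (values are illustrative).
toy = pd.DataFrame({
    "wthldngrulescd": ["Y", "Y", "N", None],
    "filedf990tcd": ["Y", None, None, None],
})
summary = yes_no_summary(toy, ["wthldngrulescd", "filedf990tcd"])
print(summary)
```

Reporting the `Y` share only among reported values keeps heavily missing columns from looking artificially non-compliant, but the missing share should always be read alongside it.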

### 4.2.2. Specific Transaction Reporting

* **Form 1098-C filed?** (filedf1098ccd): 
    * This form is filed when a charitable organization receives a vehicle donation valued over $500. It asks if the organization provided the donor with a written acknowledgment (Form 1098-C) of the donation. Analyze for transparency in reporting and tax compliance.
* **Form 8282 property disposed of?** (filedf8282cd):  
    * If the organization disposed of charitable deduction property or other types of property for which it had to use Form 8283 to report to the donors, Form 8282 must be filed within 125 days after the disposition of the property. Analyze for transparency in reporting and tax compliance.
* **Form 8886-T filed?** (filedf8886tcd): 
    * Used by tax-exempt organizations to disclose information about "prohibited tax shelter transactions" or certain other types of transactions they are involved in, ensuring transparency and compliance with tax regulations.
* **Form 8899 filed?** (filedf8899cd): 
    * Filed when an organization receives donations of intellectual property and earns income from it. Analyze for transparency in reporting and tax compliance.

In [6]:
head_columns = ['ein', 'filedf1098ccd','filedf8282cd','filedf8886tcd','filedf8899cd']
#display_head(form_990_2022, head_columns, "Specific Transaction Reporting Columns, Unique Values and Missing Data Counts:")

unique_columns = ['filedf1098ccd','filedf8282cd','filedf8886tcd','filedf8899cd']
display_stats(form_990_2022, unique_columns)

display_missing_cts(form_990_2022, unique_columns)


| Statistics   | filedf1098ccd   | filedf8282cd   | filedf8886tcd   | filedf8899cd   |
|:-------------|:----------------|:---------------|:----------------|:---------------|
| count        | 71888           | 208163         | 2141            | 71600          |
| unique       | 2               | 2              | 2               | 2              |
| top          | N               | N              | N               | N              |
| freq         | 70310           | 207732         | 2125            | 71098          |

| column name   |   missing data points |
|:--------------|----------------------:|
| filedf1098ccd |                230680 |
| filedf8282cd  |                 94405 |
| filedf8886tcd |                300427 |
| filedf8899cd  |                230968 |
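
Since `Y` answers are rare in all four columns, one possible next step is to flag organizations that answered `Y` to at least one of them, and to record how many of the four questions each organization answered at all. The sketch below uses toy data with the same column names. Treating a missing value as "not `Y`" is an assumption, not something the form itself implies.

```python
import pandas as pd

transaction_cols = ["filedf1098ccd", "filedf8282cd", "filedf8886tcd", "filedf8899cd"]

# Toy data standing in for form_990_2022 (values are illustrative).
toy = pd.DataFrame({
    "ein": [111, 222, 333],
    "filedf1098ccd": ["N", None, "Y"],
    "filedf8282cd": ["N", "N", "N"],
    "filedf8886tcd": [None, None, None],
    "filedf8899cd": ["N", None, "N"],
})

# True where at least one indicator is 'Y'; missing values count as not 'Y'.
any_yes = toy[transaction_cols].eq("Y").any(axis=1)

# Number of the four questions each organization actually answered.
n_answered = toy[transaction_cols].notna().sum(axis=1)

flagged_eins = toy.loc[any_yes, "ein"].tolist()
print(flagged_eins, n_answered.tolist())
```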

### 4.2.3. Financial Transparency and Accountability

* **Included in consolidated financial statements?** (inclinfinstmtcd): 
  * Indicates whether the entity's financial information is reported as part of the consolidated financial results of a larger group to which it belongs.
* **Separate audited financial statement** (sepindaudfinstmtcd): 
  * This asks if the organization has prepared a set of financial statements that were audited by an independent auditor following generally accepted auditing standards.

In [7]:
head_columns = ['ein', 'sepindaudfinstmtcd','inclinfinstmtcd']
#display_head(form_990_2022, head_columns, "Financial Transparency & Accountability Columns, Unique Values and Missing Data Counts:")

unique_columns = ['sepindaudfinstmtcd','inclinfinstmtcd']
display_unique_values(form_990_2022, unique_columns)

display_missing_cts(form_990_2022, unique_columns)


| sepindaudfinstmtcd   | inclinfinstmtcd   |   count |
|:---------------------|:------------------|--------:|
| N                    | N                 |  190129 |
| Y                    | N                 |   78595 |
| N                    | Y                 |   27062 |
| Y                    | Y                 |    6743 |

| column name        |   missing data points |
|:-------------------|----------------------:|
| sepindaudfinstmtcd |                    26 |
| inclinfinstmtcd    |                    39 |
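
Expressed as shares of the 302,529 rows with both values present, roughly 63% of organizations answer `N` to both questions and about 28% report a separate audited financial statement (78,595 + 6,743). A minimal sketch of computing such shares directly with `pd.crosstab` follows. The toy data is illustrative, and rows with a missing value in either column are dropped from the table.

```python
import pandas as pd

# Toy data standing in for form_990_2022 (values are illustrative).
toy = pd.DataFrame({
    "sepindaudfinstmtcd": ["N", "Y", "N", "Y", "N", None],
    "inclinfinstmtcd":    ["N", "N", "Y", "Y", "N", "N"],
})

# Share of organizations in each Y/N combination.
# normalize="all" divides every cell by the total number of complete rows.
shares = pd.crosstab(
    toy["sepindaudfinstmtcd"],
    toy["inclinfinstmtcd"],
    normalize="all",
)
print(shares.round(2))
```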

### 4.2.4. Schedules

* **Schedule O completed?** (compltschocd): 
  * Used to provide additional information to the IRS that is not covered elsewhere in Form 990 or its schedules.
* **Schedule B required?** (schdbind): 
  * Required for organizations that receive certain levels of contributions. It provides detailed information about donors who contribute significant amounts to the organization.
* **Schedule J required?** (rptyestocompnstncd): 
  * Used if the organization compensates its officers, directors, trustees, key employees, highest compensated employees, and independent contractors. It provides details on compensation practices and policies.

In [8]:
head_columns = ['ein', 'compltschocd','schdbind','rptyestocompnstncd']
#display_head(form_990_2022, head_columns, "Schedules Information Columns, Unique Values and Missing Data Counts:")

unique_columns = ['compltschocd','schdbind','rptyestocompnstncd']
display_stats(form_990_2022, unique_columns)

display_missing_cts(form_990_2022, unique_columns)


| Statistics   | compltschocd   | schdbind   | rptyestocompnstncd   |
|:-------------|:---------------|:-----------|:---------------------|
| count        | 302514         | 302544     | 302537               |
| unique       | 2              | 2          | 2                    |
| top          | Y              | Y          | N                    |
| freq         | 275980         | 154082     | 238416               |

| column name        |   missing data points |
|:-------------------|----------------------:|
| compltschocd       |                    54 |
| schdbind           |                    24 |
| rptyestocompnstncd |                    31 |
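
The `top` and `freq` rows above imply that about 91% of filers completed Schedule O, about 51% were required to file Schedule B, and about 79% answered `N` for Schedule J. The sketch below shows one way to get the share of `Y` answers per column directly. The `toy` frame is hypothetical; substitute `form_990_2022[unique_columns]` for the real extract.

```python
import pandas as pd

# Hypothetical toy rows standing in for form_990_2022 (values are made up).
toy = pd.DataFrame({
    "compltschocd":       ["Y", "Y", "Y", "N"],
    "schdbind":           ["Y", "N", None, "N"],
    "rptyestocompnstncd": ["N", "N", "Y", "N"],
})

# Share of "Y" among the non-missing answers in each column.
yes_share = toy.eq("Y").sum() / toy.notna().sum()
print(yes_share.round(3))
```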

## 4.3. Financial Information
   
The following columns may be helpful in analyzing the financial information of tax-exempt organizations.

### 4.3.1. Expenses
* **Accounting fees** (accntingfees): 
  * High fees may indicate rigorous financial oversight and auditing, contributing to transparency.
* **Compensation of current officers, directors, etc.** (compnsatncurrofcr): 
  * Indicates how much of the organization's budget is allocated to leadership.
* **Compensation of disqualified persons** (compnsatnandothr): 
  * Helps identify potential conflicts of interest or self-dealing, which are critical for evaluating ethical practices.
* **Fundraising expenses** (lessdirfndrsng): 
  * When compared with fundraising income, reveals how efficiently the organization raises funds.
* **Interest expense** (interestamt):  
  * Insights into the organization's debt levels can inform assessments of financial health and sustainability.
* **Legal fees** (legalfees): 
  * Elevated fees may be associated with compliance, litigation, or regulatory challenges.
* **Management fees** (feesforsrvcmgmt):
  * Can provide an understanding of how much is spent on external management services.
* **Travel/entertainment expenses for public officials** (travelofpublicoffcl):
  * Spending in this area might reflect lobbying or advocacy efforts.
* **Lobbying fees** (feesforsrvclobby): 
  * Indicates the organization's investment in influencing public policy.
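
The comparison of fundraising expenses with fundraising income mentioned in the list above can be sketched as a cost-per-dollar ratio. This is an illustration under assumptions, not the analysis used in this document: it assumes that gross fundraising income equals `netincfndrsng` plus `lessdirfndrsng`, and the `toy` frame and the `cost_per_dollar` column name are hypothetical.

```python
import pandas as pd

# Hypothetical toy rows standing in for form_990_2022 (values are made up).
toy = pd.DataFrame({
    "lessdirfndrsng": [200.0, 0.0, 500.0],
    "netincfndrsng":  [800.0, 0.0, -100.0],
})

# Assumption: gross fundraising income = net income + direct expenses.
gross = toy["netincfndrsng"] + toy["lessdirfndrsng"]

# Cost to raise one dollar; left as NaN where gross income is not positive.
toy["cost_per_dollar"] = (toy["lessdirfndrsng"] / gross).where(gross > 0)
print(toy)
```

Leaving the ratio undefined when gross income is zero or negative avoids reporting a misleading efficiency figure for the many filers that report no fundraising activity.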

In [12]:
head_columns = ['ein', 'accntingfees','legalfees','feesforsrvcmgmt','interestamt']
display(Markdown('Operational Expenses Stats and Missing Data Counts:'))

unique_columns = [ 'accntingfees','legalfees','feesforsrvcmgmt','interestamt']
display_stats(form_990_2022, unique_columns)

display_missing_cts(form_990_2022, unique_columns)


Operational Expenses Stats and Missing Data Counts:

| Statistics   | accntingfees   | legalfees   | feesforsrvcmgmt   | interestamt   |
|:-------------|:---------------|:------------|:------------------|:--------------|
| count        | 302,567        | 302,567     | 302,567           | 302,567       |
| mean         | 16,464         | 24,147      | 84,727            | 109,351       |
| std          | 254,240        | 457,998     | 3,589,010         | 2,738,262     |
| min          | -250,080       | -2,509,474  | -1,661,700        | -36,829,924   |
| 25%          | 0              | 0           | 0                 | 0             |
| 50%          | 2,777          | 0           | 0                 | 0             |
| 75%          | 12,134         | 690         | 0                 | 9             |
| max          | 125,165,101    | 78,620,215  | 1,239,504,856     | 887,289,629   |

| column name     |   missing data points |
|:----------------|----------------------:|
| accntingfees    |                     1 |
| legalfees       |                     1 |
| feesforsrvcmgmt |                     1 |
| interestamt     |                     1 |
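
Every column in the table above has a negative minimum, which is unexpected for an expense amount and worth inspecting before further analysis. A minimal sketch for locating such rows is shown below. The `toy` frame is hypothetical; on the real data the same calls would be applied to `form_990_2022[unique_columns]`.

```python
import pandas as pd

# Hypothetical toy rows standing in for form_990_2022 (values are made up).
toy = pd.DataFrame({
    "accntingfees": [0, 2777, -250, 12134],
    "legalfees":    [0, 0, 690, -2500],
    "interestamt":  [0, 9, 0, 0],
})

# Number of negative entries in each expense column.
negative_counts = toy.lt(0).sum()

# Rows with a negative value in at least one expense column.
negative_rows = toy[toy.lt(0).any(axis=1)]

print(negative_counts)
print(negative_rows)
```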

In [13]:
head_columns = ['ein', 'compnsatncurrofcr','compnsatnandothr']
#display_head(form_990_2022, head_columns, "Governance & Compliance Expenses Sample View, Summary Statistics and Missing Data Counts:")
display(Markdown('Governance & Compliance Expenses Stats and Missing Data Counts:'))

unique_columns = [ 'compnsatncurrofcr','compnsatnandothr']
display_stats(form_990_2022, unique_columns)

display_missing_cts(form_990_2022, unique_columns)


Governance & Compliance Expenses Stats and Missing Data Counts:

| Statistics   | compnsatncurrofcr   | compnsatnandothr   |
|:-------------|:--------------------|:-------------------|
| count        | 302,567             | 302,567            |
| mean         | 148,922             | 18,745             |
| std          | 1,035,575           | 956,231            |
| min          | -255,394            | -104,000           |
| 25%          | 0                   | 0                  |
| 50%          | 0                   | 0                  |
| 75%          | 92,800              | 0                  |
| max          | 290,936,676         | 348,932,907        |

| column name       |   missing data points |
|:------------------|----------------------:|
| compnsatncurrofcr |                     1 |
| compnsatnandothr  |                     1 |

In [14]:
head_columns = ['ein', 'lessdirfndrsng','feesforsrvclobby','travelofpublicoffcl']
#display_head(form_990_2022, head_columns, "Advocacy and Development Expenses Sample View, Summary Statistics and Missing Data Counts:")
display(Markdown('Advocacy and Development Expenses Stats and Missing Data Counts:'))

unique_columns = [ 'lessdirfndrsng','feesforsrvclobby','travelofpublicoffcl']
display_stats(form_990_2022, unique_columns)

display_missing_cts(form_990_2022, unique_columns)


Advocacy and Development Expenses Stats and Missing Data Counts:

| Statistics   | lessdirfndrsng   | feesforsrvclobby   | travelofpublicoffcl   |
|:-------------|:-----------------|:-------------------|:----------------------|
| count        | 302,567          | 302,567            | 302,567               |
| mean         | 10,229           | 3,786              | 76                    |
| std          | 115,939          | 248,544            | 4,351                 |
| min          | -342,167         | -522               | -1,422                |
| 25%          | 0                | 0                  | 0                     |
| 50%          | 0                | 0                  | 0                     |
| 75%          | 0                | 0                  | 0                     |
| max          | 28,140,778       | 130,670,613        | 1,442,748             |

| column name         |   missing data points |
|:--------------------|----------------------:|
| lessdirfndrsng      |                     1 |
| feesforsrvclobby    |                     1 |
| travelofpublicoffcl |                     1 |

### 4.3.2. Assets and Liabilities
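* **Intangible assets, end of year** (intangibleassetsend)
* **Total assets, end of year** (totassetsend)
* **Total liabilities, end of year** (totliabend)
* **Total liabilities and net assets, end of year** (totnetliabastend)
* **Total net assets, end of year** (totnetassetend)
* **Land, buildings, and equipment (net), end of year** (lndbldgsequipend)
* **Investments in other securities, end of year** (invstmntsothrend)
* **Investments in publicly traded securities, end of year** (invstmntsend)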

In [15]:
head_columns = ['ein', 'intangibleassetsend','totassetsend','totnetassetend','lndbldgsequipend']
#display_head(form_990_2022, head_columns, "Assets Sample View, Summary Statistics and Missing Data Counts:")
display(Markdown('Assets Stats and Missing Data Counts:'))

unique_columns = [ 'intangibleassetsend','totassetsend','totnetassetend','lndbldgsequipend']
display_stats(form_990_2022, unique_columns)

display_missing_cts(form_990_2022, unique_columns)


Assets Stats and Missing Data Counts:

| Statistics   | intangibleassetsend   | totassetsend    | totnetassetend   | lndbldgsequipend   |
|:-------------|:----------------------|:----------------|:-----------------|:-------------------|
| count        | 302,567               | 302,567         | 302,567          | 302,567            |
| mean         | 95,511                | 25,333,061      | 14,947,363       | 4,853,755          |
| std          | 6,902,728             | 488,396,351     | 329,075,888      | 74,485,565         |
| min          | -966,054              | -18,791,913     | -6,988,396,841   | -3,191,287         |
| 25%          | 0                     | 267,718         | 170,890          | 0                  |
| 50%          | 0                     | 842,983         | 621,358          | 22,239             |
| 75%          | 0                     | 3,348,143       | 2,342,776        | 611,870            |
| max          | 3,194,166,742         | 109,337,000,000 | 65,568,618,577   | 18,734,223,221     |

| column name         |   missing data points |
|:--------------------|----------------------:|
| intangibleassetsend |                     1 |
| totassetsend        |                     1 |
| totnetassetend      |                     1 |
| lndbldgsequipend    |                     1 |

In [16]:
head_columns = ['ein', 'totliabend','totnetliabastend']
#display_head(form_990_2022, head_columns, "Liabilities Sample View, Summary Statistics and Missing Data Counts:")
display(Markdown('Liabilities Stats and Missing Data Counts:'))

unique_columns = [ 'totliabend','totnetliabastend']
display_stats(form_990_2022, unique_columns)

display_missing_cts(form_990_2022, unique_columns)


Liabilities Stats and Missing Data Counts:

| Statistics   | totliabend     | totnetliabastend   |
|:-------------|:---------------|:-------------------|
| count        | 302,567        | 302,567            |
| mean         | 10,385,678     | 25,333,032         |
| std          | 270,363,350    | 488,396,353        |
| min          | -9,410,630     | -18,791,913        |
| 25%          | 0              | 267,750            |
| 50%          | 35,737         | 843,003            |
| 75%          | 449,692        | 3,348,206          |
| max          | 98,638,683,146 | 109,337,000,000    |

| column name      |   missing data points |
|:-----------------|----------------------:|
| totliabend       |                     1 |
| totnetliabastend |                     1 |
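
The mean of `totnetliabastend` (25,333,032) differs slightly from the mean of `totassetsend` (25,333,061) reported earlier, so at least some filings do not balance. One way to find them is to test the balance-sheet identity row by row, as in the sketch below. The `toy` frame and the one-dollar `tolerance` are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical toy rows standing in for form_990_2022 (values are made up).
toy = pd.DataFrame({
    "totassetsend":     [1000, 500, 300],
    "totliabend":       [400, 100, 50],
    "totnetassetend":   [600, 400, 200],
    "totnetliabastend": [1000, 500, 250],
})

tolerance = 1  # assumed allowance for rounding, in dollars

# Total assets should equal liabilities plus net assets ...
gap_components = toy["totassetsend"] - (toy["totliabend"] + toy["totnetassetend"])
# ... and should also equal the reported "liabilities + net assets" total.
gap_reported = toy["totassetsend"] - toy["totnetliabastend"]

mismatch = toy[(gap_components.abs() > tolerance) | (gap_reported.abs() > tolerance)]
print(mismatch)
```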

In [17]:
head_columns = ['ein', 'invstmntsothrend','rptinvstothsecd','invstmntsend']
#display_head(form_990_2022, head_columns, "Investments Sample View, Summary Statistics and Missing Data Counts:")
display(Markdown('Investments Stats and Missing Data Counts:'))

unique_columns = [ 'invstmntsothrend','rptinvstothsecd','invstmntsend']
display_stats(form_990_2022, unique_columns)

display_missing_cts(form_990_2022, unique_columns)

Investments Stats and Missing Data Counts:

| Statistics   | invstmntsothrend   | invstmntsend   |
|:-------------|:-------------------|:---------------|
| count        | 302,567            | 302,567        |
| mean         | 4,607,963          | 6,691,181      |
| std          | 193,548,972        | 205,204,512    |
| min          | -126,971,236       | -2,107,316     |
| 25%          | 0                  | 0              |
| 50%          | 0                  | 0              |
| 75%          | 0                  | 0              |
| max          | 49,780,087,314     | 49,082,798,623 |

| column name      |   missing data points |
|:-----------------|----------------------:|
| invstmntsothrend |                     1 |
| rptinvstothsecd  |                    27 |
| invstmntsend     |                     1 |
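
Because the 75th percentile of both investment columns is 0, at most a quarter of filers report a positive balance, and the means are driven by a small number of very large holders. Summaries restricted to non-zero rows can be more informative. The sketch below uses a hypothetical `toy` frame to show the idea.

```python
import pandas as pd

# Hypothetical toy rows standing in for form_990_2022 (values are made up).
toy = pd.DataFrame({
    "invstmntsothrend": [0, 0, 0, 5000, 0, 200],
    "invstmntsend":     [0, 0, 1000, 3000, 0, 0],
})

# Share of rows reporting a non-zero balance in each column.
nonzero_share = toy.ne(0).mean()

# Median among non-zero rows only (zeros are masked to NaN and skipped).
nonzero_median = toy.where(toy.ne(0)).median()

print(nonzero_share.round(3))
print(nonzero_median)
```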

### 4.3.3. Revenue
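* **Program service revenue** (totprgmrevnue)
* **Total revenue**
* **Investment income** (invstmntinc)
* **Fundraising income** (netincfndrsng)
* **Income from sales of inventory** (netincsales)
* **Net unrelated business income (170)** (netincunreltd170)
* **Net unrelated business income (509)** (unreltxincls511tx509)
* **Gross income from interest, etc. (170)** (grsinc170)
* **Gross income from interest, etc. (509)** (grsinc509)
* **Gross income from members** (grsincmembers)
* **Gross income from other sources** (grsincother)
* **Gross receipts amount**
* **Gross receipts from related activities (170)** (grsrcptsrelated170)
* **Gross receipts from related activities (509)** (grsrcptsactivities509)
* **Gross rents, real estate** (grsrntsreal)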

In [18]:
head_columns = ['ein', 'totprgmrevnue','grsrcptsrelated170','grsrcptsactivities509']
#display_head(form_990_2022, head_columns, "Direct Mission-Related Revenue Sample View, Summary Statistics and Missing Data Counts:")
display(Markdown('Direct Mission-Related Revenue Stats and Missing Data Counts:'))

unique_columns = [ 'totprgmrevnue','grsrcptsrelated170','grsrcptsactivities509']
display_stats(form_990_2022, unique_columns)

display_missing_cts(form_990_2022, unique_columns)

Direct Mission-Related Revenue Stats and Missing Data Counts:

| Statistics   | totprgmrevnue   | grsrcptsrelated170   | grsrcptsactivities509   |
|:-------------|:----------------|:---------------------|:------------------------|
| count        | 302,567         | 302,567              | 302,567                 |
| mean         | 7,537,612       | 4,874,712            | 63,710                  |
| std          | 177,617,550     | 304,874,226          | 10,532,675              |
| min          | -16,997,704     | -1,569,177,171       | -4,112,488              |
| 25%          | 0               | 0                    | 0                       |
| 50%          | 41,433          | 0                    | 0                       |
| 75%          | 434,256         | 0                    | 0                       |
| max          | 68,337,728,116  | 114,560,000,000      | 5,596,470,578           |

| column name           |   missing data points |
|:----------------------|----------------------:|
| totprgmrevnue         |                     1 |
| grsrcptsrelated170    |                     1 |
| grsrcptsactivities509 |                     1 |

In [19]:
head_columns = ['ein', 'invstmntinc','grsinc170','grsinc509']
#display_head(form_990_2022, head_columns, "Supplementary and Passive Income Sample View, Summary Statistics and Missing Data Counts:")
display(Markdown('Supplementary and Passive Income Stats and Missing Data Counts:'))

unique_columns = [ 'invstmntinc','grsinc170','grsinc509']
display_stats(form_990_2022, unique_columns)

display_missing_cts(form_990_2022, unique_columns)

Supplementary and Passive Income Stats and Missing Data Counts:

| Statistics   | invstmntinc   | grsinc170     | grsinc509   |
|:-------------|:--------------|:--------------|:------------|
| count        | 302,567       | 302,567       | 302,567     |
| mean         | 221,372       | 351,596       | 74,561      |
| std          | 7,620,701     | 12,351,410    | 2,371,957   |
| min          | -67,276,661   | -40,823,708   | -4,168,671  |
| 25%          | 0             | 0             | 0           |
| 50%          | 190           | 0             | 0           |
| 75%          | 9,676         | 11            | 0           |
| max          | 1,957,907,109 | 2,550,388,301 | 663,357,624 |

| column name   |   missing data points |
|:--------------|----------------------:|
| invstmntinc   |                     1 |
| grsinc170     |                     1 |
| grsinc509     |                     1 |

In [20]:
head_columns = ['ein', 'netincfndrsng','grsincmembers','grsincother']
#display_head(form_990_2022, head_columns, "Fundraising and Non-Mission Income Sample View, Summary Statistics and Missing Data Counts:")
display(Markdown('Fundraising and Non-Mission Income Stats and Missing Data Counts:'))

unique_columns = [ 'netincfndrsng','grsincmembers','grsincother']
display_stats(form_990_2022, unique_columns)

display_missing_cts(form_990_2022, unique_columns)

Fundraising and Non-Mission Income Stats and Missing Data Counts:

| Statistics   | netincfndrsng   | grsincmembers   | grsincother   |
|:-------------|:----------------|:----------------|:--------------|
| count        | 302,567         | 302,567         | 302,567       |
| mean         | 6,396           | 230,625         | 8,543         |
| std          | 87,257          | 10,323,608      | 619,705       |
| min          | -13,121,659     | 0               | -25,005,000   |
| 25%          | 0               | 0               | 0             |
| 50%          | 0               | 0               | 0             |
| 75%          | 0               | 0               | 0             |
| max          | 14,362,240      | 3,706,190,064   | 179,381,153   |

| column name   |   missing data points |
|:--------------|----------------------:|
| netincfndrsng |                     1 |
| grsincmembers |                     1 |
| grsincother   |                     1 |

In [21]:
head_columns = ['ein', 'netincunreltd170','unreltxincls511tx509','netincsales','grsrntsreal']
#display_head(form_990_2022, head_columns, "Unrelated Business and Other Activities Sample View, Summary Statistics and Missing Data Counts:")
display(Markdown('Unrelated Business and Other Activities Stats and Missing Data Counts:'))

unique_columns = [ 'netincunreltd170','unreltxincls511tx509','netincsales','grsrntsreal']
display_stats(form_990_2022, unique_columns)

display_missing_cts(form_990_2022, unique_columns)

Unrelated Business and Other Activities Stats and Missing Data Counts:

| Statistics   | netincunreltd170   | unreltxincls511tx509   | netincsales   | grsrntsreal   |
|:-------------|:-------------------|:-----------------------|:--------------|:--------------|
| count        | 302,567            | 302,567                | 302,567       | 302,567       |
| mean         | 7,559              | 3,491                  | 35,075        | 33,715        |
| std          | 292,738            | 961,340                | 1,618,071     | 952,833       |
| min          | -16,604,762        | -1,124,379             | -47,654,734   | -3,986,501    |
| 25%          | 0                  | 0                      | 0             | 0             |
| 50%          | 0                  | 0                      | 0             | 0             |
| 75%          | 0                  | 0                      | 0             | 0             |
| max          | 76,570,391         | 480,662,670            | 690,476,299   | 238,594,235   |

| column name          |   missing data points |
|:---------------------|----------------------:|
| netincunreltd170     |                     1 |
| unreltxincls511tx509 |                     1 |
| netincsales          |                     1 |
| grsrntsreal          |                     1 |

### 4.3.4. Contributions and Support
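* **Total contributions** (totcntrbgfts)
* **Non-deductible contributions** (solicitcntrbcd)
* **Gifts, grants, and membership fees received (170)** (gftgrntsrcvd170)
* **Gifts, grants, and membership fees received (509)** (totgftgrntrcvd509)
* **Public support (170)** (pubsupplesspct170)
* **Public support (509)** (pubsupplesub509)
* **Total support (170)** (totsupp170)
* **Total support (509)** (totsupp509)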


In [25]:
head_columns = ['ein', 'totcntrbgfts','solicitcntrbcd','gftgrntsrcvd170','totgftgrntrcvd509','pubsupplesspct170','pubsupplesub509','totsupp170','totsupp509']

unique_columns = [ 'totcntrbgfts','solicitcntrbcd','gftgrntsrcvd170','totgftgrntrcvd509','pubsupplesspct170','pubsupplesub509','totsupp170','totsupp509']
display_stats(form_990_2022, unique_columns)

display_missing_cts(form_990_2022, unique_columns)

| Statistics   | totcntrbgfts   | gftgrntsrcvd170   | totgftgrntrcvd509   | pubsupplesspct170   | pubsupplesub509   | totsupp170     | totsupp509      |
|:-------------|:---------------|:------------------|:--------------------|:--------------------|:------------------|:---------------|:----------------|
| count        | 302,567        | 302,567           | 302,567             | 302,567             | 302,567           | 302,567        | 302,567         |
| mean         | 2,520,504      | 6,863,245         | 874,030             | 6,232,202           | 6,646,222         | 7,391,403      | 6,959,749       |
| std          | 60,427,717     | 170,377,928       | 27,019,059          | 158,919,434         | 590,705,512       | 180,125,449    | 592,108,481     |
| min          | -3,862,804     | -34,940           | -58,558             | -35,469,055         | -10,414,479       | -34,940        | -10,414,479     |
| 25%          | 11,085         | 0                 | 0                   | 0                   | 0                 | 0              | 0               |
| 50%          | 171,486        | 0                 | 0                   | 0                   | 0                 | 0              | 0               |
| 75%          | 663,806        | 826,866           | 24,000              | 759,907             | 336,654           | 938,236        | 375,198         |
| max          | 16,367,570,660 | 50,435,033,482    | 9,495,915,494       | 49,167,605,373      | 308,910,000,000   | 53,061,992,174 | 308,954,000,000 |

| column name       |   missing data points |
|:------------------|----------------------:|
| totcntrbgfts      |                     1 |
| gftgrntsrcvd170   |                     1 |
| totgftgrntrcvd509 |                     1 |
| pubsupplesspct170 |                     1 |
| pubsupplesub509   |                     1 |
| totsupp170        |                     1 |
| totsupp509        |                     1 |
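
Public support and total support are most useful when read together, since their ratio indicates how much of an organization's support comes from the public. The sketch below computes that ratio for the 170 columns. It is an illustration only: the `toy` frame and the `pubsupp_ratio170` column name are hypothetical, and treating `pubsupplesspct170` divided by `totsupp170` as a public support share is an assumption to check against the Schedule A instructions.

```python
import pandas as pd

# Hypothetical toy rows standing in for form_990_2022 (values are made up).
toy = pd.DataFrame({
    "pubsupplesspct170": [750.0, 0.0, 100.0],
    "totsupp170":        [1000.0, 0.0, 400.0],
})

total = toy["totsupp170"]

# Assumed public support share; NaN where total support is not positive.
toy["pubsupp_ratio170"] = (toy["pubsupplesspct170"] / total).where(total > 0)
print(toy)
```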

### 4.3.5. Grantmaking
   
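* **Grants payable, end of year** (grntspayableend)
* **Grants to governments and organizations in the US** (grntstogovt)
* **Grants to individuals in the US** (grnsttoindiv)
* **Grants to organizations and individuals outside the US** (grntstofrgngovt)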

In [6]:
head_columns = ['ein', 'grntspayableend','grntstogovt','grnsttoindiv','grntstofrgngovt','totnooforgscnt']

unique_columns = [ 'grntspayableend','grntstogovt','grnsttoindiv','grntstofrgngovt','totnooforgscnt']
display_stats(form_990_2022, unique_columns)

display_missing_cts(form_990_2022, unique_columns)


| Statistics   | grntspayableend   | grntstogovt    | grnsttoindiv   | grntstofrgngovt   | totnooforgscnt   |
|:-------------|:------------------|:---------------|:---------------|:------------------|:-----------------|
| count        | 302,567           | 302,567        | 302,567        | 302,567           | 302,567          |
| mean         | 71,383            | 538,179        | 278,081        | 158,665           | 0                |
| std          | 7,239,469         | 30,557,289     | 7,934,695      | 20,667,180        | 54               |
| min          | -1,014,299        | -65,000        | -2,800         | -1,955            | 0                |
| 25%          | 0                 | 0              | 0              | 0                 | 0                |
| 50%          | 0                 | 0              | 0              | 0                 | 0                |
| 75%          | 0                 | 0              | 0              | 0                 | 0                |
| max          | 3,655,040,890     | 10,014,763,351 | 2,166,472,459  | 8,362,101,007     | 29,614           |

| column name     |   missing data points |
|:----------------|----------------------:|
| grntspayableend |                     1 |
| grntstogovt     |                     1 |
| grnsttoindiv    |                     1 |
| grntstofrgngovt |                     1 |
| totnooforgscnt  |                     1 |
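
The 75th percentile of every grant column above is 0, so most filers report no grants at all. For an analysis of giving it may help to combine the three grant expense columns and flag the organizations that made any grants. The following sketch does this on a hypothetical `toy` frame; the `total_grants` and `is_grantmaker` column names are introduced here for illustration.

```python
import pandas as pd

grant_cols = ["grntstogovt", "grnsttoindiv", "grntstofrgngovt"]

# Hypothetical toy rows standing in for form_990_2022 (values are made up).
toy = pd.DataFrame({
    "grntstogovt":     [1000, 0, 0],
    "grnsttoindiv":    [0, 0, 250],
    "grntstofrgngovt": [500, 0, 0],
})

# Total grants paid across the three recipient categories.
toy["total_grants"] = toy[grant_cols].sum(axis=1)

# Flag organizations reporting any grant spending.
toy["is_grantmaker"] = toy["total_grants"] > 0

grantmaker_share = toy["is_grantmaker"].mean()
print(toy)
print(f"Share of grantmakers: {grantmaker_share:.1%}")
```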

## 4.4. Organizational & Operational Details
   
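* **Number of employees** (noemplyeesw3cnt)
* **Number of individuals compensated more than $100K** (noindiv100kcnt)
* **Number of contractors paid more than $100K**
* **Number of organizations supported** (totnooforgscnt)
* **Occupancy** (occupancy)
* **Travel** (travel)
* **Conferences, conventions, and meetings** (converconventmtng)
* **Foreign office?** (frgnofficecd)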

In [7]:
head_columns = ['ein', 'noemplyeesw3cnt','noindiv100kcnt','occupancy','travel','converconventmtng','frgnofficecd','totnooforgscnt']

unique_columns = [ 'noemplyeesw3cnt','noindiv100kcnt','occupancy','travel','converconventmtng','frgnofficecd','totnooforgscnt']
display_stats(form_990_2022, unique_columns)

display_missing_cts(form_990_2022, unique_columns)


| Statistics   | noemplyeesw3cnt   | noindiv100kcnt   | occupancy   | travel      | converconventmtng   | totnooforgscnt   |
|:-------------|:------------------|:-----------------|:------------|:------------|:--------------------|:-----------------|
| count        | 302,567           | 302,567          | 302,567     | 302,567     | 302,567             | 302,567          |
| mean         | 67                | 6                | 243,348     | 23,757      | 18,764              | 0                |
| std          | 1,302             | 363              | 3,286,609   | 483,047     | 232,857             | 54               |
| min          | 0                 | 0                | -19,431,500 | -142,888    | -981,444            | 0                |
| 25%          | 0                 | 0                | 0           | 0           | 0                   | 0                |
| 50%          | 2                 | 0                | 8,827       | 0           | 0                   | 0                |
| 75%          | 16                | 0                | 54,798      | 3,518       | 1,036               | 0                |
| max          | 520,474           | 137,288          | 502,675,626 | 172,465,765 | 34,382,682          | 29,614           |

| column name       |   missing data points |
|:------------------|----------------------:|
| noemplyeesw3cnt   |                     1 |
| noindiv100kcnt    |                     1 |
| occupancy         |                     1 |
| travel            |                     1 |
| converconventmtng |                     1 |
| frgnofficecd      |                    26 |
| totnooforgscnt    |                     1 |
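
With a median of 2 employees and a maximum above 500,000, the employee count is easier to work with once grouped into size bands. The sketch below uses `pd.cut` for this. The `toy` frame, the band edges, and the `size_band` column name are all assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical toy rows standing in for form_990_2022 (values are made up).
toy = pd.DataFrame({"noemplyeesw3cnt": [0, 2, 16, 150, 520474]})

# Assumed band edges; each interval is closed on the right, e.g. (0, 9].
bins = [-1, 0, 9, 99, float("inf")]
labels = ["0", "1-9", "10-99", "100+"]

toy["size_band"] = pd.cut(toy["noemplyeesw3cnt"], bins=bins, labels=labels)

# Count of organizations per band, in band order.
band_counts = toy["size_band"].value_counts(sort=False)
print(band_counts)
```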

## 4.5. Governance and Relationships
   
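* **Business relationship through family member?** (fmlybusnreltdcd)
* **Business relationship with organization?** (dirbusnreltdcd)
* **Grant to related person?** (grantoofficercd)
* **Related entity?** (reltdorgcd)
* **Receivables from disqualified persons, end of year** (rcvbldisqualend)
* **Receivables from officers, directors, etc., end of year** (currfrmrcvblend)
* **Loan to officer or disqualified person?** (loantofficercd)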

In [9]:
head_columns = ['ein', 'fmlybusnreltdcd','dirbusnreltdcd','grantoofficercd','reltdorgcd','rcvbldisqualend','currfrmrcvblend','loantofficercd']

unique_columns = [ 'rcvbldisqualend','currfrmrcvblend']
display_stats(form_990_2022, unique_columns)

unique_columns = [ 'fmlybusnreltdcd','dirbusnreltdcd','grantoofficercd','reltdorgcd','loantofficercd']
display_stats(form_990_2022, unique_columns)

display_missing_cts(form_990_2022, unique_columns)


| Statistics   | rcvbldisqualend   | currfrmrcvblend   |
|:-------------|:------------------|:------------------|
| count        | 302,567           | 302,567           |
| mean         | 2,067             | 9,539             |
| std          | 502,628           | 467,460           |
| min          | -953,698          | -175,117          |
| 25%          | 0                 | 0                 |
| 50%          | 0                 | 0                 |
| 75%          | 0                 | 0                 |
| max          | 263,752,773       | 132,421,382       |

| Statistics   | fmlybusnreltdcd   | dirbusnreltdcd   | grantoofficercd   | reltdorgcd   | loantofficercd   |
|:-------------|:------------------|:-----------------|:------------------|:-------------|:-----------------|
| count        | 302532            | 302540           | 302538            | 302540       | 302542           |
| unique       | 2                 | 2                | 2                 | 2            | 2                |
| top          | N                 | N                | N                 | N            | N                |
| freq         | 296747            | 295510           | 301405            | 234251       | 294695           |

| column name     |   missing data points |
|:----------------|----------------------:|
| fmlybusnreltdcd |                    36 |
| dirbusnreltdcd  |                    28 |
| grantoofficercd |                    30 |
| reltdorgcd      |                    28 |
| loantofficercd  |                    26 |
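
Each relationship flag is `N` for the large majority of filers, so a per-organization count of `Y` answers may be a more useful signal than any single column. A minimal sketch follows; the `toy` frame and the `n_relationship_flags` column name are hypothetical.

```python
import pandas as pd

flag_cols = ["fmlybusnreltdcd", "dirbusnreltdcd", "grantoofficercd",
             "reltdorgcd", "loantofficercd"]

# Hypothetical toy rows standing in for form_990_2022 (values are made up).
toy = pd.DataFrame(
    [["N", "N", "N", "N", "N"],
     ["Y", "N", "N", "Y", "N"],
     ["N", "Y", "N", "Y", "Y"]],
    columns=flag_cols,
)

# Number of "Y" answers per organization; missing answers count as not "Y".
toy["n_relationship_flags"] = toy[flag_cols].eq("Y").sum(axis=1)
print(toy["n_relationship_flags"].value_counts().sort_index())
```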

## 4.6. Regulatory Compliance and Activities
   
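* **Lobbying activities?** (lbbyingactvtscd)
* **Political activities?** (politicalactvtscd)
* **Foreign activities, etc.?** (frgnrevexpnscd)
* **More than $5,000 to individuals, Part IX, line 3?** (frgnaggragrntscd)
* **More than $5,000 to organizations, Part IX, line 3?** (frgngrntscd)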

In [10]:
head_columns = ['ein', 'lbbyingactvtscd','politicalactvtscd','frgnrevexpnscd','frgnaggragrntscd','frgngrntscd']

unique_columns = [ 'lbbyingactvtscd','politicalactvtscd','frgnrevexpnscd','frgnaggragrntscd','frgngrntscd']
display_stats(form_990_2022, unique_columns)

display_missing_cts(form_990_2022, unique_columns)

| Statistics   | lbbyingactvtscd   | politicalactvtscd   | frgnrevexpnscd   | frgnaggragrntscd   | frgngrntscd   |
|:-------------|:------------------|:--------------------|:-----------------|:-------------------|:--------------|
| count        | 234533            | 302543              | 302535           | 302547             | 302545        |
| unique       | 2                 | 2                   | 2                | 2                  | 2             |
| top          | N                 | N                   | N                | N                  | N             |
| freq         | 223202            | 299678              | 289995           | 299872             | 294259        |

| column name       |   missing data points |
|:------------------|----------------------:|
| lbbyingactvtscd   |                 68035 |
| politicalactvtscd |                    25 |
| frgnrevexpnscd    |                    33 |
| frgnaggragrntscd  |                    21 |
| frgngrntscd       |                    23 |
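
The lobbying flag stands out: 68,035 of 302,568 rows (about 22.5%) are missing, compared with a few dozen for the other columns. One possible explanation, which should be verified against the form instructions, is that the question applies only to a subset of filers. Expressing missingness as a percentage makes such outliers easy to spot, as in the sketch below on a hypothetical `toy` frame.

```python
import pandas as pd

# Hypothetical toy rows standing in for form_990_2022 (values are made up).
toy = pd.DataFrame({
    "lbbyingactvtscd":   ["N", None, "Y", None],
    "politicalactvtscd": ["N", "N", "N", "N"],
})

# Percentage of missing values per column, highest first.
missing_pct = toy.isna().mean().mul(100).round(1).sort_values(ascending=False)
print(missing_pct)
```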