# INFO 3401 

## Background

Representatives elected to the U.S. House are given budgets to hire staff and to run their offices in Washington, D.C. as well as in their home districts. Senior leadership offices and committees also receive budgets to hire their own staff and conduct their business. These funds cannot be used for personal or campaign expenses, and there is no reserve fund if members run over budget. These budgets vary across members and committees and depend on a variety of factors. Detailed records of these disbursements are [published quarterly as CSV files](https://www.house.gov/the-house-explained/open-government/statement-of-disbursements/archive) going back to 2010.

Make sure to also check out the [Details](https://www.house.gov/the-house-explained/open-government/statement-of-disbursements/details), [FAQ](https://www.house.gov/the-house-explained/open-government/statement-of-disbursements/frequently-asked-questions), [Glossary](https://www.house.gov/the-house-explained/open-government/statement-of-disbursements/glossary-of-terms), and [Transaction Codes](https://www.house.gov/the-house-explained/open-government/statement-of-disbursements/transaction-codes).

For more optional background, also check out the [cleaned data](https://projects.propublica.org/represent/expenditures), [blog posts](https://www.propublica.org/article/update-on-house-disbursements-a-few-notes-on-how-to-use-the-data), [training resources](https://www.propublica.org/documents/item/3230540-75012825-House-Disbursements-Reports-Training.html), and stories about [budget reductions](https://www.propublica.org/article/house-operating-budget-cuts-paving-way-for-more-special-interest-influence) and [staff turnover](https://www.propublica.org/article/turnover-in-the-house-who-keeps-and-who-loses-the-most-staff) that ProPublica and the Sunlight Foundation have published about these data.



In [2]:
import pandas as pd
import numpy as np

## Step 1: Formulate a question

In this notebook, we will examine the following question: 

What do the House members who spend the most on travel have in common?

## Step 2: Read in the data

Load the 2019 Q1 (January to March) "SOD Detail Transactions" data from the official House site and assign it to `h20q1_df`.

`read_csv` function:
* **encoding** If you get a character encoding error, pass 'latin1' to the "encoding" parameter.
* **parse_dates** Pass "TRANSACTION DATE", "PERFORM START DT", and "PERFORM END DT" as a list to the "parse_dates" parameter.
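
As a minimal sketch of how the two parameters above behave, the example below reads a tiny in-memory CSV rather than the real file. The second row, including the name "2019 CAFÉ EXAMPLE", is hypothetical and exists only to put a non-UTF-8 byte in the data; `raw_bytes`, `date_cols`, and `sample_df` are illustrative names.

```python
import io
import pandas as pd

# Hypothetical two-row sample encoded as Latin-1; the "É" byte is not valid UTF-8.
raw_bytes = (
    "ORGANIZATION,TRANSACTION DATE,PERFORM START DT,PERFORM END DT,AMOUNT\n"
    "2019 OFFICE OF THE SPEAKER,28-Feb-19,3-Jan-19,31-Jan-19,32.76\n"
    "2019 CAFÉ EXAMPLE,28-Mar-19,1-Feb-19,28-Feb-19,31.43\n"
).encode("latin1")

# The three date columns named in the instructions above.
date_cols = ["TRANSACTION DATE", "PERFORM START DT", "PERFORM END DT"]

# encoding tells pandas how to decode the bytes; parse_dates converts the
# listed columns from strings to datetime values.
sample_df = pd.read_csv(io.BytesIO(raw_bytes), encoding="latin1", parse_dates=date_cols)
print(sample_df.dtypes)
```

Checking `dtypes` after loading is a quick way to confirm that the date columns were actually parsed rather than left as plain strings.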

In [3]:
def read_csv(filename, encoding, parse_dates):
    """Read a Statement of Disbursements CSV into a DataFrame, parsing the given date columns."""
    h20q1_df = pd.read_csv(filename, encoding=encoding, parse_dates=parse_dates)
    return h20q1_df

In [4]:
filename = "JAN-MAR 2019 SOD DETAIL GRID.CSV"
h20q1_df = read_csv(filename,'utf_8',["TRANSACTION DATE", "PERFORM START DT", "PERFORM END DT"])

In [5]:
h20q1_df

Unnamed: 0,ORGANIZATION,PROGRAM,SORT SUBTOTAL DESCRIPTION,SORT SEQUENCE,TRANSACTION DATE,DATA SOURCE,DOCUMENT,VENDOR NAME,PERFORM START DT,PERFORM END DT,DESCRIPTION,AMOUNT
0,2019 OFFICE OF THE SPEAKER,GENERAL EXPENDITURES,FRANKED MAIL,DETAIL,28-Feb-19,AP,1087620,UNITED STATES POSTAL SERVICE,3-Jan-19,31-Jan-19,FRANKED MAIL,32.76
1,2019 OFFICE OF THE SPEAKER,GENERAL EXPENDITURES,FRANKED MAIL,DETAIL,28-Mar-19,AP,1099415,UNITED STATES POSTAL SERVICE,1-Feb-19,28-Feb-19,FRANKED MAIL,31.43
2,2019 OFFICE OF THE SPEAKER,GENERAL EXPENDITURES,FRANKED MAIL,SUBTOTAL,,,,,,,FRANKED MAIL TOTALS:,64.19
3,2019 OFFICE OF THE SPEAKER,GENERAL EXPENDITURES,PERSONNEL COMPENSATION,DETAIL,,,,ANDROFF BLAKE J,3-Jan-19,31-Mar-19,EXC DIR DEM POL & COMM CMTE,40333.33
4,2019 OFFICE OF THE SPEAKER,GENERAL EXPENDITURES,PERSONNEL COMPENSATION,DETAIL,,,,BELTRAN ELIZABETH R,20-Feb-19,31-Mar-19,STAFF ASSISTANT,3701.39
...,...,...,...,...,...,...,...,...,...,...,...,...
118343,FISCAL YEAR 2018 PAGING,PAGING,EQUIPMENT,DETAIL,13-Feb-19,AP,1076281,BEARCOM,1-Feb-19,28-Feb-19,MAINTENANCE / REPAIRS,6405.41
118344,FISCAL YEAR 2018 PAGING,PAGING,EQUIPMENT,DETAIL,7-Mar-19,AP,1088092,BEARCOM,1-Mar-19,31-Mar-19,MAINTENANCE / REPAIRS,6405.41
118345,FISCAL YEAR 2018 PAGING,PAGING,EQUIPMENT,SUBTOTAL,,,,,,,EQUIPMENT TOTALS:,38432.46
118346,FISCAL YEAR 2018 PAGING,PAGING,EQUIPMENT,SUBTOTAL,,,,,,,PAGING TOTALS:,38432.46


## Step 3: Check the data

How many rows and columns are in `h20q1_df`?

In [6]:
# .shape returns a (rows, columns) tuple
rows, columns = h20q1_df.shape
print("there are",rows,"rows and",columns,"columns")

there are 118348 rows and 12 columns


## Step 4: Investigate the data

I wanted to gain a better sense of what the data actually means. I did this by inspecting the first 10 and last 10 rows of data and investigating the [Details](https://www.house.gov/the-house-explained/open-government/statement-of-disbursements/details), [FAQ](https://www.house.gov/the-house-explained/open-government/statement-of-disbursements/frequently-asked-questions), [Glossary](https://www.house.gov/the-house-explained/open-government/statement-of-disbursements/glossary-of-terms), and [Transaction Codes](https://www.house.gov/the-house-explained/open-government/statement-of-disbursements/transaction-codes) on the dataset website.


In [7]:
h20q1_df.head(10)


Unnamed: 0,ORGANIZATION,PROGRAM,SORT SUBTOTAL DESCRIPTION,SORT SEQUENCE,TRANSACTION DATE,DATA SOURCE,DOCUMENT,VENDOR NAME,PERFORM START DT,PERFORM END DT,DESCRIPTION,AMOUNT
0,2019 OFFICE OF THE SPEAKER,GENERAL EXPENDITURES,FRANKED MAIL,DETAIL,28-Feb-19,AP,1087620.0,UNITED STATES POSTAL SERVICE,3-Jan-19,31-Jan-19,FRANKED MAIL,32.76
1,2019 OFFICE OF THE SPEAKER,GENERAL EXPENDITURES,FRANKED MAIL,DETAIL,28-Mar-19,AP,1099415.0,UNITED STATES POSTAL SERVICE,1-Feb-19,28-Feb-19,FRANKED MAIL,31.43
2,2019 OFFICE OF THE SPEAKER,GENERAL EXPENDITURES,FRANKED MAIL,SUBTOTAL,,,,,,,FRANKED MAIL TOTALS:,64.19
3,2019 OFFICE OF THE SPEAKER,GENERAL EXPENDITURES,PERSONNEL COMPENSATION,DETAIL,,,,ANDROFF BLAKE J,3-Jan-19,31-Mar-19,EXC DIR DEM POL & COMM CMTE,40333.33
4,2019 OFFICE OF THE SPEAKER,GENERAL EXPENDITURES,PERSONNEL COMPENSATION,DETAIL,,,,BELTRAN ELIZABETH R,20-Feb-19,31-Mar-19,STAFF ASSISTANT,3701.39
5,2019 OFFICE OF THE SPEAKER,GENERAL EXPENDITURES,PERSONNEL COMPENSATION,DETAIL,,,,BELTRAN ELIZABETH R,20-Feb-19,28-Feb-19,STAFF ASSISTANT (OVERTIME),222.65
6,2019 OFFICE OF THE SPEAKER,GENERAL EXPENDITURES,PERSONNEL COMPENSATION,DETAIL,,,,BERRET EMILY C,3-Jan-19,31-Mar-19,DIR OF OPERATIONS & ADVISOR,28611.12
7,2019 OFFICE OF THE SPEAKER,GENERAL EXPENDITURES,PERSONNEL COMPENSATION,DETAIL,,,,CAPRON MARGARET W.,3-Jan-19,28-Feb-19,SENIOR ADV POLICY & COMM,22679.52
8,2019 OFFICE OF THE SPEAKER,GENERAL EXPENDITURES,PERSONNEL COMPENSATION,DETAIL,,,,CAPRON MARGARET W.,1-Mar-19,31-Mar-19,SENIOR ADV FOR POLICY & COMM,11891.42
9,2019 OFFICE OF THE SPEAKER,GENERAL EXPENDITURES,PERSONNEL COMPENSATION,DETAIL,,,,CHERRY STEPHANIE,3-Jan-19,31-Mar-19,DIRECTOR OF MEDIA AFFAIRS,21222.22


In [10]:
h20q1_df.tail(10)

Unnamed: 0,ORGANIZATION,PROGRAM,SORT SUBTOTAL DESCRIPTION,SORT SEQUENCE,TRANSACTION DATE,DATA SOURCE,DOCUMENT,VENDOR NAME,PERFORM START DT,PERFORM END DT,DESCRIPTION,AMOUNT
118338,FISCAL YEAR 2019 CHILD CARE CTR,CHILD CARE CTR,SUPPLIES AND MATERIALS,GRAND TOTAL FOR ORGANIZATION,,,,,,,OFFICE TOTALS:,449.1
118339,FISCAL YEAR 2018 PAGING,PAGING,EQUIPMENT,DETAIL,22-Jan-19,AP,1066487.0,BEARCOM,1-Jan-19,31-Jan-19,MAINTENANCE / REPAIRS,6405.41
118340,FISCAL YEAR 2018 PAGING,PAGING,EQUIPMENT,DETAIL,28-Jan-19,AP,1068947.0,BEARCOM,1-Oct-18,30-Oct-18,MAINTENANCE / REPAIRS,6405.41
118341,FISCAL YEAR 2018 PAGING,PAGING,EQUIPMENT,DETAIL,28-Jan-19,AP,1068951.0,BEARCOM,1-Nov-18,30-Nov-18,MAINTENANCE / REPAIRS,6405.41
118342,FISCAL YEAR 2018 PAGING,PAGING,EQUIPMENT,DETAIL,28-Jan-19,AP,1068989.0,BEARCOM,1-Dec-18,31-Dec-18,MAINTENANCE / REPAIRS,6405.41
118343,FISCAL YEAR 2018 PAGING,PAGING,EQUIPMENT,DETAIL,13-Feb-19,AP,1076281.0,BEARCOM,1-Feb-19,28-Feb-19,MAINTENANCE / REPAIRS,6405.41
118344,FISCAL YEAR 2018 PAGING,PAGING,EQUIPMENT,DETAIL,7-Mar-19,AP,1088092.0,BEARCOM,1-Mar-19,31-Mar-19,MAINTENANCE / REPAIRS,6405.41
118345,FISCAL YEAR 2018 PAGING,PAGING,EQUIPMENT,SUBTOTAL,,,,,,,EQUIPMENT TOTALS:,38432.46
118346,FISCAL YEAR 2018 PAGING,PAGING,EQUIPMENT,SUBTOTAL,,,,,,,PAGING TOTALS:,38432.46
118347,FISCAL YEAR 2018 PAGING,PAGING,EQUIPMENT,GRAND TOTAL FOR ORGANIZATION,,,,,,,OFFICE TOTALS:,38432.46


## Step 5: Check the "n"s and clean the data

Check the kinds of values that are possible for each of the "ORGANIZATION", "PROGRAM", "SORT SUBTOTAL DESCRIPTION", and "SORT SEQUENCE" columns, and how frequently they occur.

In [11]:
org_series = h20q1_df["ORGANIZATION"]
pro_series = h20q1_df["PROGRAM"]
sub_series = h20q1_df["SORT SUBTOTAL DESCRIPTION"]
seq_series = h20q1_df["SORT SEQUENCE"]


In [12]:
org_series.value_counts()

FISCAL YEAR 2019 GOVERNMENT CONTRIBUTIONS         6244
FISCAL YEAR 2019 CHIEF ADMIN OFCR OF THE HOUSE    2757
FISCAL YEAR 2019 CLERK OF THE HOUSE                791
FISCAL YEAR 2019 SERGEANT AT ARMS                  602
FISCAL YEAR 2018 CHIEF ADMIN OFCR OF THE HOUSE     492
                                                  ... 
2017 HON. KAREN C. HANDEL                            4
2017 HON. GLENN GROTHMAN                             4
2017 HON. BRAD SHERMAN                               4
2017 HON. JIM COOPER                                 4
2015 LEGISLATIVE COUNSEL                             4
Name: ORGANIZATION, Length: 1224, dtype: int64

In [13]:
pro_series.value_counts()

OFFICIAL EXPENSES OF MEMBERS      95905
GENERAL EXPENDITURES               6575
GOVERNMENT CONTRIBUTIONS           6296
SALARIES  OFFICERS & EMPLOYEES     1892
ADMIN AND OPS                      1675
                                  ...  
SUBSCRIPTIONS                         3
PBX SWITCH MAINTENANCE                3
ENTERPRISE STORAGE SYSTEMS            3
FEDERAL OFFICE BUILDING 8             3
OVERSEAS TRVL CAP POLICE REIMB        3
Name: PROGRAM, Length: 108, dtype: int64

In [14]:
sub_series.value_counts()

TRAVEL                            25993
PERSONNEL COMPENSATION            22881
SUPPLIES AND MATERIALS            21283
RENT  COMMUNICATION  UTILITIES    20328
OTHER SERVICES                     7106
PERSONNEL BENEFITS                 6347
EQUIPMENT                          6283
PRINTING AND REPRODUCTION          4734
FRANKED MAIL                       3339
TRANSPORTATION OF THINGS             49
BENEFITS TO FORMER PERSONNEL          5
Name: SORT SUBTOTAL DESCRIPTION, dtype: int64

In [15]:
seq_series.value_counts()
# print(len(seq_series))

DETAIL                          108131
SUBTOTAL                          8984
GRAND TOTAL FOR ORGANIZATION      1233
Name: SORT SEQUENCE, dtype: int64
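
The four separate frequency checks above could also be run in a single loop. The sketch below assumes a hypothetical three-row miniature of the table (`sample_df`) and only two of the columns; the values are illustrative, not taken from the real file.

```python
import pandas as pd

# Hypothetical miniature of the disbursement table (values are illustrative).
sample_df = pd.DataFrame({
    "PROGRAM": ["GENERAL EXPENDITURES", "GENERAL EXPENDITURES", "PAGING"],
    "SORT SEQUENCE": ["DETAIL", "SUBTOTAL", "DETAIL"],
})

counts = {}
for col in ["PROGRAM", "SORT SEQUENCE"]:
    # dropna=False also counts missing values, which value_counts skips by default.
    counts[col] = sample_df[col].value_counts(dropna=False)
    print(counts[col], end="\n\n")
```

Passing `dropna=False` is optional, but it makes missing values visible in the counts instead of silently omitting them.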

I used Boolean indexing to remove the rows where "SORT SEQUENCE" is "SUBTOTAL" or "GRAND TOTAL FOR ORGANIZATION", since these rows are aggregations of the detail rows and would double count spending. I then reassigned `h20q1_df` so that it keeps only the rows whose "SORT SEQUENCE" equals "DETAIL".

In [16]:
# True for line-item rows, False for subtotal and grand-total rows
boolean_sub = (seq_series == "DETAIL")
h20q1_df = h20q1_df[boolean_sub]
h20q1_df

Unnamed: 0,ORGANIZATION,PROGRAM,SORT SUBTOTAL DESCRIPTION,SORT SEQUENCE,TRANSACTION DATE,DATA SOURCE,DOCUMENT,VENDOR NAME,PERFORM START DT,PERFORM END DT,DESCRIPTION,AMOUNT
0,2019 OFFICE OF THE SPEAKER,GENERAL EXPENDITURES,FRANKED MAIL,DETAIL,28-Feb-19,AP,1087620,UNITED STATES POSTAL SERVICE,3-Jan-19,31-Jan-19,FRANKED MAIL,32.76
1,2019 OFFICE OF THE SPEAKER,GENERAL EXPENDITURES,FRANKED MAIL,DETAIL,28-Mar-19,AP,1099415,UNITED STATES POSTAL SERVICE,1-Feb-19,28-Feb-19,FRANKED MAIL,31.43
3,2019 OFFICE OF THE SPEAKER,GENERAL EXPENDITURES,PERSONNEL COMPENSATION,DETAIL,,,,ANDROFF BLAKE J,3-Jan-19,31-Mar-19,EXC DIR DEM POL & COMM CMTE,40333.33
4,2019 OFFICE OF THE SPEAKER,GENERAL EXPENDITURES,PERSONNEL COMPENSATION,DETAIL,,,,BELTRAN ELIZABETH R,20-Feb-19,31-Mar-19,STAFF ASSISTANT,3701.39
5,2019 OFFICE OF THE SPEAKER,GENERAL EXPENDITURES,PERSONNEL COMPENSATION,DETAIL,,,,BELTRAN ELIZABETH R,20-Feb-19,28-Feb-19,STAFF ASSISTANT (OVERTIME),222.65
...,...,...,...,...,...,...,...,...,...,...,...,...
118340,FISCAL YEAR 2018 PAGING,PAGING,EQUIPMENT,DETAIL,28-Jan-19,AP,1068947,BEARCOM,1-Oct-18,30-Oct-18,MAINTENANCE / REPAIRS,6405.41
118341,FISCAL YEAR 2018 PAGING,PAGING,EQUIPMENT,DETAIL,28-Jan-19,AP,1068951,BEARCOM,1-Nov-18,30-Nov-18,MAINTENANCE / REPAIRS,6405.41
118342,FISCAL YEAR 2018 PAGING,PAGING,EQUIPMENT,DETAIL,28-Jan-19,AP,1068989,BEARCOM,1-Dec-18,31-Dec-18,MAINTENANCE / REPAIRS,6405.41
118343,FISCAL YEAR 2018 PAGING,PAGING,EQUIPMENT,DETAIL,13-Feb-19,AP,1076281,BEARCOM,1-Feb-19,28-Feb-19,MAINTENANCE / REPAIRS,6405.41
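
To illustrate why the subtotal rows are safe to drop, the sketch below rebuilds the first three rows of the table shown earlier (two "FRANKED MAIL" detail rows and their subtotal) and checks that the detail rows add up to the reported subtotal. The names `sample_df`, `is_detail`, and `detail_df` are illustrative, and only three columns are kept.

```python
import pandas as pd

# First three rows of the notebook output, reduced to the relevant columns.
sample_df = pd.DataFrame({
    "SORT SEQUENCE": ["DETAIL", "DETAIL", "SUBTOTAL"],
    "DESCRIPTION": ["FRANKED MAIL", "FRANKED MAIL", "FRANKED MAIL TOTALS:"],
    "AMOUNT": [32.76, 31.43, 64.19],
})

# Boolean mask built from the same frame it filters, so the indexes line up.
is_detail = sample_df["SORT SEQUENCE"] == "DETAIL"
detail_df = sample_df[is_detail]

# Compare the sum of the detail rows with the subtotal row the file reports.
detail_sum = round(detail_df["AMOUNT"].sum(), 2)
reported_subtotal = sample_df.loc[sample_df["SORT SEQUENCE"] == "SUBTOTAL", "AMOUNT"].iloc[0]
print(detail_sum, reported_subtotal)
```

If the two numbers match, keeping the subtotal row alongside the detail rows would count the same spending twice.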


## Step 6: Validating the data

"TRAVEL" is the most frequent expense category under "SORT SUBTOTAL DESCRIPTION". I used Boolean indexing to create a new DataFrame called `travel_df` that contains only the rows where "SORT SUBTOTAL DESCRIPTION" is "TRAVEL". From this I was able to get the average value of "AMOUNT" in `travel_df`.
- I needed to rename the "AMOUNT" column because its name in the dataset is padded with trailing spaces, which made it difficult to reference by name.

In [17]:
# Build the mask from the filtered frame so its index matches h20q1_df
travel_df = h20q1_df[h20q1_df["SORT SUBTOTAL DESCRIPTION"] == "TRAVEL"]
travel_df.columns


Index(['ORGANIZATION', 'PROGRAM', 'SORT SUBTOTAL DESCRIPTION', 'SORT SEQUENCE',
       'TRANSACTION DATE', 'DATA SOURCE', 'DOCUMENT', 'VENDOR NAME',
       'PERFORM START DT', 'PERFORM END DT', 'DESCRIPTION',
       'AMOUNT                                                                                                                                           '],
      dtype='object')

In [18]:
# The last column is "AMOUNT" padded with trailing spaces; rename it to "DOLLARS"
cols = travel_df.columns
col_a = cols[-1]
travel_df = travel_df.rename(columns = {col_a:'DOLLARS'})
travel_df.columns

Index(['ORGANIZATION', 'PROGRAM', 'SORT SUBTOTAL DESCRIPTION', 'SORT SEQUENCE',
       'TRANSACTION DATE', 'DATA SOURCE', 'DOCUMENT', 'VENDOR NAME',
       'PERFORM START DT', 'PERFORM END DT', 'DESCRIPTION', 'DOLLARS'],
      dtype='object')
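
An alternative to renaming the padded column by position is to strip whitespace from every column name at once. This is only a sketch on a hypothetical one-row frame (`sample_df`), and unlike the rename above it keeps the name "AMOUNT" rather than changing it to "DOLLARS".

```python
import pandas as pd

# Hypothetical frame whose last header carries trailing spaces, as in the raw CSV.
sample_df = pd.DataFrame({"DESCRIPTION": ["LODGING"], "AMOUNT      ": [120.50]})

# Index.str.strip() removes leading and trailing whitespace from every column name.
sample_df.columns = sample_df.columns.str.strip()
print(list(sample_df.columns))
```

Stripping all names does not depend on the padded column being last, so it would keep working if the column order changed.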

In [19]:
dol_mean = travel_df["DOLLARS"].mean()
print("The average dollar amount for each expense is $"+ str(round(dol_mean,2)))

The average dollar amount for each expense is $266.38


**Judging from the description field, I think this value represents the average amount of a single travel expense, not of all expenses in the dataset, since `travel_df` holds only "TRAVEL" rows and each row is one expense. I think this is reasonable because travel includes many small items such as taxi fares, parking, and tolls alongside larger ones such as lodging and commercial transportation.**
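
A mean on its own can hide how uneven the individual expenses are, so it may help to look at the median and the spread as well. The sketch below uses five hypothetical amounts (not values from the dataset) in a Series named `amounts`.

```python
import pandas as pd

# Hypothetical travel amounts: mostly small items plus one large fare.
amounts = pd.Series([4.00, 12.50, 50.00, 72.00, 1200.00], name="DOLLARS")

# describe() reports count, mean, std, min, quartiles, and max in one call.
summary = amounts.describe()
print(summary)
print("mean:", amounts.mean(), "median:", amounts.median())
```

When a few large expenses pull the mean well above the median, the median is often the better description of a typical expense.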

## Pivot Table

Next I made a pivot table from `travel_df` with "ORGANIZATION" as the index, "DESCRIPTION" as the columns, "DOLLARS" (the renamed "AMOUNT" column) as the values, and 'sum' as the aggfunc.

In [25]:
# One row per organization, one column per travel category, each cell the summed dollars
travel_table = travel_df.pivot_table(index=['ORGANIZATION'], columns=['DESCRIPTION'], values="DOLLARS", aggfunc="sum").reset_index()
travel_table

DESCRIPTION,ORGANIZATION,AUTOMOBILE LEASE,CAR RENTAL,COMMERCIAL TRANSPORTATION,CONSULT TRAVEL / RELATED EXP,FIELD HEARING SUPPORT COST,GASOLINE,LODGING,MEALS,MISCELLANEOUS TRAVEL,PRIVATE AUTO MILEAGE,TAXI/PARKING/TOLLS,WITNESS TRAVEL / RELATED EXP
0,2015 HON. DOUG LAMBORN,,,,,,,,52.83,,,50.00,
1,2015 OTHER ADMINISTRATION,,,50.00,,,,,,,789.62,,
2,2016 COMMITTEE ON NATURAL RESOURCES,,,0.80,,,,,,,,,
3,2016 HON. CHRIS STEWART,,-17.92,,,,,,,,,,
4,2016 HON. DOUG LAMBORN,,,,,,,,10.98,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...
965,FISCAL YEAR 2019 OFFICE OF ATTENDING PHYSICIAN,,,,,,,,,,,72.00,
966,FISCAL YEAR 2019 OFFICE OF CONGRESSIONAL ETHICS,,,528.40,,,,2211.03,765.99,702.95,151.52,330.99,
967,FISCAL YEAR 2019 OFFICE OF GENERAL COUNSEL,,126.84,443.59,,,12.50,208.25,80.11,,,55.09,
968,FISCAL YEAR 2019 OFFICE OF INSPECTOR GENERAL,,,,,,,,,,45.24,192.00,


Next I sorted the resulting table in descending order by several columns: "COMMERCIAL TRANSPORTATION", "MEALS", and "PRIVATE AUTO MILEAGE".

In [26]:
# Sort the pivot table from highest to lowest spending in each category of interest
m_table = travel_table.sort_values(by=['MEALS'],ascending=False)
ct_table = travel_table.sort_values(by=['COMMERCIAL TRANSPORTATION'],ascending=False)
pam_table = travel_table.sort_values(by=['PRIVATE AUTO MILEAGE'],ascending=False)

m_table

DESCRIPTION,ORGANIZATION,AUTOMOBILE LEASE,CAR RENTAL,COMMERCIAL TRANSPORTATION,CONSULT TRAVEL / RELATED EXP,FIELD HEARING SUPPORT COST,GASOLINE,LODGING,MEALS,MISCELLANEOUS TRAVEL,PRIVATE AUTO MILEAGE,TAXI/PARKING/TOLLS,WITNESS TRAVEL / RELATED EXP
958,FISCAL YEAR 2019 CHIEF ADMIN OFCR OF THE HOUSE,,3661.48,26186.99,,,247.39,13261.07,4342.46,102.00,2614.25,1841.59,
397,2018 HON. ROGER W. MARSHALL,,3131.75,4217.30,,,422.82,4255.75,4125.95,,1042.50,349.16,
948,FISCAL YEAR 2016 CHIEF ADMIN OFCR OF THE HOUSE,,1487.84,8914.31,,,309.64,13620.36,3157.72,60.00,20004.90,1032.45,
964,FISCAL YEAR 2019 NEW MEMBER ORIENTATION,,,160347.31,,,,432385.53,2795.60,30.00,1389.20,4338.11,
311,2018 HON. MADELEINE Z. BORDALLO,,3584.20,30012.40,,,448.13,15270.64,2504.50,413.75,,1781.62,
...,...,...,...,...,...,...,...,...,...,...,...,...,...
957,FISCAL YEAR 2019 BROADCAST SERVICES,,,,,,,,,,249.75,,
960,FISCAL YEAR 2019 COMMUNICATIONS,,,,,,1858.40,,,,,,
963,FISCAL YEAR 2019 MISCELLANEOUS AUTOMOBILES,29321.67,,,,,3131.88,,,,,,
965,FISCAL YEAR 2019 OFFICE OF ATTENDING PHYSICIAN,,,,,,,,,,,72.00,


In [23]:
ct_table


DESCRIPTION,ORGANIZATION,AUTOMOBILE LEASE,CAR RENTAL,COMMERCIAL TRANSPORTATION,CONSULT TRAVEL / RELATED EXP,FIELD HEARING SUPPORT COST,GASOLINE,LODGING,MEALS,MISCELLANEOUS TRAVEL,PRIVATE AUTO MILEAGE,TAXI/PARKING/TOLLS,WITNESS TRAVEL / RELATED EXP
964,FISCAL YEAR 2019 NEW MEMBER ORIENTATION,,,160347.31,,,,432385.53,2795.60,30.00,1389.20,4338.11,
510,2019 HON. AL GREEN,1682.49,,30107.24,,,50.00,1679.93,141.78,,47.56,98.31,
311,2018 HON. MADELEINE Z. BORDALLO,,3584.20,30012.40,,,448.13,15270.64,2504.50,413.75,,1781.62,
800,2019 HON. MICHAEL F.Q. SAN NICOLAS,,,27598.54,,,,4031.11,1462.43,,,1259.42,
958,FISCAL YEAR 2019 CHIEF ADMIN OFCR OF THE HOUSE,,3661.48,26186.99,,,247.39,13261.07,4342.46,102.00,2614.25,1841.59,
...,...,...,...,...,...,...,...,...,...,...,...,...,...
960,FISCAL YEAR 2019 COMMUNICATIONS,,,,,,1858.40,,,,,,
961,FISCAL YEAR 2019 COMMUNICATIONS EQUIPMENT,,,,,,137.26,1882.14,1237.50,,1821.94,,
963,FISCAL YEAR 2019 MISCELLANEOUS AUTOMOBILES,29321.67,,,,,3131.88,,,,,,
965,FISCAL YEAR 2019 OFFICE OF ATTENDING PHYSICIAN,,,,,,,,,,,72.00,


In [24]:
pam_table

DESCRIPTION,ORGANIZATION,AUTOMOBILE LEASE,CAR RENTAL,COMMERCIAL TRANSPORTATION,CONSULT TRAVEL / RELATED EXP,FIELD HEARING SUPPORT COST,GASOLINE,LODGING,MEALS,MISCELLANEOUS TRAVEL,PRIVATE AUTO MILEAGE,TAXI/PARKING/TOLLS,WITNESS TRAVEL / RELATED EXP
948,FISCAL YEAR 2016 CHIEF ADMIN OFCR OF THE HOUSE,,1487.84,8914.31,,,309.64,13620.36,3157.72,60.0,20004.90,1032.45,
790,2019 HON. MARKWAYNE MULLIN,,,3581.80,,,,2650.11,384.40,,10169.16,325.95,
362,2018 HON. PAUL MITCHELL,,,708.59,,,,,,,9372.85,,
836,2019 HON. RALPH ABRAHAM,2174.94,289.97,888.00,,,28.75,1221.91,596.80,,7454.81,316.64,
915,2019 HON. TOM REED,,314.42,1011.90,,,,3824.50,39.98,,7044.43,302.82,
...,...,...,...,...,...,...,...,...,...,...,...,...,...
956,FISCAL YEAR 2019 ASSET MANAGEMENT,,,,,,,,,,,4.00,
960,FISCAL YEAR 2019 COMMUNICATIONS,,,,,,1858.40,,,,,,
963,FISCAL YEAR 2019 MISCELLANEOUS AUTOMOBILES,29321.67,,,,,3131.88,,,,,,
965,FISCAL YEAR 2019 OFFICE OF ATTENDING PHYSICIAN,,,,,,,,,,,72.00,
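
When only the top few rows matter, `nlargest` is a compact alternative to sorting the whole table. The sketch below uses three rows copied from the output above, reduced to two categories; `sample_table`, `top_meals`, and `by_transport` are illustrative names.

```python
import pandas as pd

# Three rows taken from the pivot output above, reduced to two categories.
sample_table = pd.DataFrame({
    "ORGANIZATION": [
        "2018 HON. ROGER W. MARSHALL",
        "FISCAL YEAR 2019 COMMUNICATIONS",
        "2019 HON. AL GREEN",
    ],
    "MEALS": [4125.95, None, 141.78],
    "COMMERCIAL TRANSPORTATION": [4217.30, None, 30107.24],
})

# nlargest keeps the n rows with the highest values and ignores NaN.
top_meals = sample_table.nlargest(2, "MEALS")

# sort_values places NaN rows last by default; na_position makes that explicit.
by_transport = sample_table.sort_values(
    "COMMERCIAL TRANSPORTATION", ascending=False, na_position="last"
)
print(top_meals)
print(by_transport)
```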


After finding some of the offices or committees with the highest totals, I looked up their biographies.
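
Since the question is about House members, one way to rank them is to add up the travel categories for each row and keep only member offices. The sketch below is not the method used above; it assumes that member offices are the rows whose "ORGANIZATION" contains "HON." (a pattern visible in the output, but an assumption about the full file), and it uses three rows copied from the pivot output with only three of the categories, so the totals are partial.

```python
import pandas as pd

# Rows copied from the pivot output above, reduced to three travel categories.
sample_table = pd.DataFrame({
    "ORGANIZATION": [
        "FISCAL YEAR 2019 NEW MEMBER ORIENTATION",
        "2019 HON. AL GREEN",
        "2019 HON. MICHAEL F.Q. SAN NICOLAS",
    ],
    "COMMERCIAL TRANSPORTATION": [160347.31, 30107.24, 27598.54],
    "LODGING": [432385.53, 1679.93, 4031.11],
    "MEALS": [2795.60, 141.78, 1462.43],
})

# Assumption: member offices are the rows whose name contains the literal text "HON.".
is_member = sample_table["ORGANIZATION"].str.contains("HON.", regex=False)
members = sample_table[is_member].set_index("ORGANIZATION")

# Sum across the category columns for each member; NaN cells are skipped by default.
total_travel = members.sum(axis=1).sort_values(ascending=False)
print(total_travel)
```

Filtering first keeps administrative offices such as "NEW MEMBER ORIENTATION" from crowding out the individual members in the ranking.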

## Findings
**After reviewing some examples of high spenders, I noticed a pattern: most of these elected officials have backgrounds as attorneys, businessmen, or politicians. Among these examples, the position also appears to favor white men, particularly those with army service. Lastly, I noticed that many of the top spenders have sexual misconduct allegations on their Wikipedia pages.**