# Congressional Activity - Data Prep
***
**Project:** Congressional Activity  
**Author:** Tami McManus  
**Last Updated:** September 19, 2023

This notebook reads raw congressional activity data from an Excel file and transforms it into "tidy" dataframes. The results are then saved to a multi-worksheet Excel workbook.
  
Congressional Activity Resumes were originally pulled from:  
https://www.senate.gov/

The contents of the PDF files were copied and pasted into Excel, and minor edits were made for consistency. This "raw" data file serves as input to the data wrangling process.

***
# Notebook Setup
***

Library versions used in original notebook:  
pandas -- 1.4.2 

In [1]:
# Import libraries
import pandas as pd

In [2]:
# Check library versions
print(f'Import Complete: {pd.__name__} {pd.__version__}')

Import Complete: pandas 1.4.3


In [3]:
# If you don't have openpyxl installed, uncomment the following command
# to install it. This notebook was created and tested with openpyxl version 3.1.2

#!pip install openpyxl

***  
# Read Raw Activity Data
***

In [4]:
# Read in all worksheets, dividing the data into general legislative activity and confirmation related activity
file_name = '../Data/Resume Data - Raw.xlsx'
raw_data_dict = pd.read_excel(file_name, sheet_name=None, header=None, skiprows=3, usecols='A:C')
raw_conf_dict = pd.read_excel(file_name, sheet_name=None, header=None, skiprows=4, usecols='F:G')

***
# Setup Data Processing
***

In [5]:
# Create empty dataframes to hold the final data
gen_activity_df = pd.DataFrame()
measures_df = pd.DataFrame()
confirm_df = pd.DataFrame()

In [6]:
# Define column labels by group for easier processing later on
key_cols = ['Year', 'Congress', 'Session', 'Chamber']
gen_cols = ['Days in session', 'Time in session', '...Pages of proceedings', '...Extension of Remarks',
            'Public bills enacted into law', 'Private bills enacted into law', 'Bills in conference',
            'Bills through conference', 'Special reports', 'Conference reports', 'Measures pending on calendar', 
            'Quorum calls', 'Yea-and-nay votes', 'Recorded votes', 'Bills vetoed', 'Vetoes overridden', 'Bills not signed']
measure_cols = ['Measures passed, total', '...Senate bills', '...House bills', '...Senate joint resolutions', 
                '...House joint resolutions', '...Senate concurrent resolutions', '...House concurrent resolutions',
                '...Simple resolutions', 'Measures reported, total', '...Senate bills', '...House bills', 
                '...Senate joint resolutions', '...House joint resolutions', '...Senate concurrent resolutions',
                '...House concurrent resolutions', '...Simple resolutions', 'Measures introduced, total', '...Bills', 
                '...Joint resolutions', '...Concurrent resolutions', '...Simple resolutions']

***
# Process Data - General and Legislative Activity
***

In [7]:
# Loop through the data in each worksheet and extract general and legislative activity
for key in raw_data_dict.keys():
    # Create a dataframe using the raw data read from the Excel worksheet
    raw_df = raw_data_dict[key].copy()
    raw_df.set_index(0, inplace=True)  # Use the row labels in column 0 as the index
    raw_df = raw_df.rename_axis(None)

    # Split the worksheet name into year, congress, and session
    i = key.find(' - ')
    j = key.find('.')
    year = key[:i]
    congress = key[i+3:j]
    session = key[j+1]
    
    # Add key columns to the dataframe
    key_df = pd.DataFrame({'Year': [year, year], 'Congress': [congress, congress],
                           'Session': [session, session], 'Chamber': ['Senate', 'House']}).transpose()
    key_df.columns = [1, 2]
    raw_df = pd.concat([key_df, raw_df]).copy()
        
    # Label the Senate and House data columns, remove the Congressional Record row, and drop any empty rows
    raw_df.rename({1:'Senate', 2:'House'}, axis=1, inplace=True)
    raw_df.drop(index='Congressional Record:', inplace=True)
    raw_df = raw_df[raw_df.index.notnull()]
    
    # Iterate through the data for the Senate, then the House
    for chamber in ['Senate', 'House']:
        # Filter the data for the chamber in question and transpose it
        raw_chamber_df = raw_df[[chamber]].transpose().copy()
       
        # Drop the measure columns for the general activity dataframe and the general columns for the measures dataframe
        raw_gen_activity_df = raw_chamber_df.drop(measure_cols, axis=1, errors='ignore')
        raw_measures_df = raw_chamber_df.drop(gen_cols, axis=1, errors='ignore')
        
        # Clean up the column headings for the measures dataframe
        new_col_names = []
        for col in raw_measures_df.columns:
            if not col.startswith('...'):
                prefix = col
                new_col_names.append(col)
            else:
                new_col_names.append(prefix + col)
        raw_measures_df.columns = new_col_names
                
        # Add cleaned and transposed data to the final data frames
        gen_activity_df = pd.concat([gen_activity_df, raw_gen_activity_df], ignore_index=True).copy()
        measures_df = pd.concat([measures_df, raw_measures_df], ignore_index=True).copy()
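
The worksheet-name parsing at the top of the loop can also be written as a small helper. This is a minimal sketch, not part of the original notebook: it assumes worksheet names look like `1983 - 98.1` (year, congress, session), which is inferred from the `find(' - ')` and `find('.')` calls above, and `parse_sheet_name` is a hypothetical name.

```python
import re

# Assumed worksheet-name layout: "<year> - <congress>.<session>", e.g. "1983 - 98.1"
SHEET_NAME_PATTERN = re.compile(r'^(\d{4}) - (\d+)\.(\d)')


def parse_sheet_name(name):
    """Return (year, congress, session) as strings, or raise ValueError."""
    match = SHEET_NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f'Unexpected worksheet name: {name!r}')
    return match.groups()


print(parse_sheet_name('1983 - 98.1'))  # ('1983', '98', '1')
```

Unlike the index arithmetic in the loop, a pattern like this fails loudly if a worksheet is named differently, rather than silently slicing the wrong characters.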

In [8]:
# Review the first few rows of the general activity dataframe to confirm format
gen_activity_df.head()

Unnamed: 0,Year,Congress,Session,Chamber,Days in session,Time in session,...Pages of proceedings,...Extension of Remarks,Public bills enacted into law,Private bills enacted into law,...,Bills through conference,Special reports,Conference reports,Measures pending on calendar,Quorum calls,Yea-and-nay votes,Recorded votes,Bills vetoed,Vetoes overridden,Bills not signed
0,1983,98,1,Senate,150,"1,010 hrs., 47'",,,80,,...,4,25,4.0,152,18,381,,3.0,1,
1,1983,98,1,House,146,"851 hrs., 45'",,,75,4.0,...,29,45,33.0,102,35,297,201.0,3.0,1,
2,1984,98,2,Senate,131,"940 hrs., 28'",14612.0,,112,14.0,...,21,11,,221,19,292,,2.0,1,
3,1984,98,2,House,120,"852 hrs., 59'",12284.0,,133,11.0,...,27,40,53.0,120,55,227,181.0,1.0,1,
4,1985,99,1,Senate,170,"1,252 hrs., 31'",18418.0,,110,,...,8,18,2.0,94,20,381,,,1,


In [9]:
# Review the first few rows of the legislative measures dataframe to confirm format
measures_df.head()

Unnamed: 0,Year,Congress,Session,Chamber,"Measures passed, total","Measures passed, total...Senate bills","Measures passed, total...House bills","Measures passed, total...Senate joint resolutions","Measures passed, total...House joint resolutions","Measures passed, total...Senate concurrent resolutions",...,"Measures reported, total...Senate joint resolutions","Measures reported, total...House joint resolutions","Measures reported, total...Senate concurrent resolutions","Measures reported, total...House concurrent resolutions","Measures reported, total...Simple resolutions","Measures introduced, total","Measures introduced, total...Bills","Measures introduced, total...Joint resolutions","Measures introduced, total...Concurrent resolutions","Measures introduced, total...Simple resolutions"
0,1983,98,1,Senate,596,170,99,89,34,25,...,87.0,9,19.0,2.0,139,2795,2196,209,86,302
1,1983,98,1,House,611,70,234,46,49,16,...,,9,3.0,7.0,132,5642,4580,440,237,385
2,1984,98,2,Senate,726,159,239,90,55,25,...,99.0,16,17.0,4.0,122,1302,897,150,69,186
3,1984,98,2,House,737,128,323,67,61,18,...,2.0,12,,2.0,105,2462,1862,223,142,235
4,1985,99,1,Senate,583,106,93,120,54,29,...,118.0,19,16.0,,100,2651,2000,255,102,294
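
The column names above come from prefixing each indented (`...`) label with the most recent non-indented label. The same rule is shown here in isolation as a sketch; `qualify_headings` is a hypothetical helper name, and the sample labels are taken from `measure_cols`.

```python
def qualify_headings(headings, marker='...'):
    """Prefix each indented heading with the most recent non-indented heading."""
    qualified = []
    parent = ''
    for heading in headings:
        if heading.startswith(marker):
            # Indented entry: attach it to the section it belongs to
            qualified.append(parent + heading)
        else:
            # Section heading: remember it for the entries that follow
            parent = heading
            qualified.append(heading)
    return qualified


sample = ['Measures passed, total', '...Senate bills', '...House bills',
          'Measures introduced, total', '...Bills']
for name in qualify_headings(sample):
    print(name)
```

This is what makes the repeated labels (for example `...Senate bills` under both "passed" and "reported") unique before the dataframes are concatenated.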


***
# Process Data - Confirmations
***

In [10]:
for key in raw_conf_dict.keys():
    # Copy the dataframe from the current worksheet
    raw_df = raw_conf_dict[key].copy()
    raw_df.set_index(5, inplace=True)
    raw_df = raw_df.rename_axis(None)

    # Split the worksheet name into year, congress, and session
    i = key.find(' - ')
    j = key.find('.')
    year = key[:i]
    congress = key[i+3:j]
    session = key[j+1]
    
    # Remove NaN Indexes
    raw_df = raw_df[raw_df.index.notnull()]
    
    # Clean up the column headings for the confirmations dataframe
    raw_conf_df = pd.DataFrame()
    for i, row in raw_df.iterrows():
        carryover = 0
        if not row.name.startswith('...'):
            # Split the label from the count for each section heading
            if ', totaling ' in row.name:
                label, count = row.name.split(', totaling ')
                count = count.split(', ')[0].replace(',', '')
                
                # Split out the number of carryover nominations and exclude this from total so we don't double count
                if ' (inc' in label:
                    label, carryover = label.split(' (including ')
                    carryover = int(carryover.split()[0].replace(',', ''))
                if ' (inc' in count:
                    count, carryover = count.split(' (including ')
                    carryover = int(carryover.split()[0].replace(',', ''))
            else:
                label = row.name
                count = row.item()
            # Add the section label to each row to distinguish branch from action
            prev_label = label
            
        else:
            label = prev_label + row.name
            count = row.item()
        
        # Remove commas from numeric data and convert from strings to ints
        if isinstance(count, str) and ' ' in count:
            count = int(count.split(' ')[0])
        if carryover > 0:
            count = int(count.replace(',', '')) -  carryover
        raw_conf_df = pd.concat([raw_conf_df, pd.DataFrame({label: [count]}).transpose()]).copy()
        
    # Transpose the dataframe
    raw_conf_df = raw_conf_df.transpose().copy()
        
    # Add key columns
    key_df = pd.DataFrame({'Year': [year], 'Congress': [congress], 'Session': [session], 'Chamber': ['Senate']})
    raw_conf_df = pd.concat([key_df, raw_conf_df], axis=1).copy() 
                
    # Add cleaned and transposed data to the final data frames
    confirm_df = pd.concat([confirm_df, raw_conf_df], ignore_index=True).copy()
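
The section-heading logic in this loop (split the label from the total, then subtract carried-over nominations so they are not double counted) can be sketched with regular expressions. The heading text and numbers below are hypothetical, chosen only to match the `', totaling '` and `' (including '` markers the loop looks for; the real worksheet wording may differ, and `parse_section_heading` is not part of the original notebook.

```python
import re

TOTAL = re.compile(r', totaling ([\d,]*\d)')
CARRYOVER = re.compile(r'\(including ([\d,]*\d) ')


def to_int(text):
    """Convert a string such as '1,250' to an int."""
    return int(text.replace(',', ''))


def parse_section_heading(heading):
    """Return (label, total excluding carryover, carryover)."""
    total_match = TOTAL.search(heading)
    if total_match is None:
        raise ValueError(f'No total found in heading: {heading!r}')
    carry_match = CARRYOVER.search(heading)
    carryover = to_int(carry_match.group(1)) if carry_match else 0
    # The label is whatever precedes both markers
    label = heading.split(', totaling ')[0].split(' (including ')[0]
    return label, to_int(total_match.group(1)) - carryover, carryover


# Hypothetical heading, for illustration only
heading = ('Army nominations, totaling 1,250 (including 250 nominations '
           'carried over from the First Session), disposed of as follows:')
print(parse_section_heading(heading))  # ('Army nominations', 1000, 250)
```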

In [11]:
# Review the first few rows of the confirmations dataframe to confirm format
confirm_df.head()

Unnamed: 0,Year,Congress,Session,Chamber,Army nominations,Army nominations...Confirmed,Army nominations...Failed at August-September adjournment,Army nominations...Failed at November 18 sine die adjournment,Navy nominations,Navy nominations...Confirmed,...,Other civilian nominations,Other civilian nominations...Confirmed,Other civilian nominations...Withdrawn,Other civilian nominations...Returned to White House,Summary...Total nominations received this Session,Other Civilian nominations...Withdrawn,Space Force nominations,Space Force nominations...Confirmed,Space Force nominations...Unconfirmed,Space Force nominations...Withdrawn
0,1983,98,1,Senate,14784.0,14782.0,1.0,1.0,21994,21994.0,...,,,,,,,,,,
1,1984,98,2,Senate,14031.0,14031.0,,,8855,8855.0,...,,,,,,,,,,
2,1985,99,1,Senate,15370.0,14478.0,,,16721,16720.0,...,,,,,,,,,,
3,1986,99,2,Senate,9918.0,10810.0,,,9952,9952.0,...,,,,,,,,,,
4,1987,100,1,Senate,14448.0,12086.0,,,12101,12055.0,...,,,,,,,,,,


***
# Cleanup Data - General
***

In [12]:
# Review the first few rows of the general activity dataframe
gen_activity_df.head()

Unnamed: 0,Year,Congress,Session,Chamber,Days in session,Time in session,...Pages of proceedings,...Extension of Remarks,Public bills enacted into law,Private bills enacted into law,...,Bills through conference,Special reports,Conference reports,Measures pending on calendar,Quorum calls,Yea-and-nay votes,Recorded votes,Bills vetoed,Vetoes overridden,Bills not signed
0,1983,98,1,Senate,150,"1,010 hrs., 47'",,,80,,...,4,25,4.0,152,18,381,,3.0,1,
1,1983,98,1,House,146,"851 hrs., 45'",,,75,4.0,...,29,45,33.0,102,35,297,201.0,3.0,1,
2,1984,98,2,Senate,131,"940 hrs., 28'",14612.0,,112,14.0,...,21,11,,221,19,292,,2.0,1,
3,1984,98,2,House,120,"852 hrs., 59'",12284.0,,133,11.0,...,27,40,53.0,120,55,227,181.0,1.0,1,
4,1985,99,1,Senate,170,"1,252 hrs., 31'",18418.0,,110,,...,8,18,2.0,94,20,381,,,1,


In [13]:
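# Remove the indent marker (...) from column headings and format all headings as titles for consistency
# Note: str.lstrip('.') removes leading '.' characters; it takes a set of characters, not a substring
gen_activity_df.columns = gen_activity_df.columns.str.lstrip('.')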
gen_activity_df.columns = gen_activity_df.columns.str.title()

In [14]:
# Replace all NaN values with 0
gen_activity_df.fillna(0, inplace=True)

In [15]:
# Convert all numeric columns to int
cols = gen_activity_df.columns.to_list()
int_cols = [x for x in cols if x not in ['Chamber', 'Time In Session', 'Pages Of Proceedings', 'Extension Of Remarks']]
gen_activity_df[int_cols] = gen_activity_df[int_cols].astype(int)

In [16]:
# Reformat the time column as ##h ##m for ease of analysis
# regex=False treats each pattern literally (otherwise the '.' in ' hrs.,' would match any character)
gen_activity_df['Time In Session'] = gen_activity_df['Time In Session'].str.replace(' hrs.,', 'h', regex=False)
gen_activity_df['Time In Session'] = gen_activity_df['Time In Session'].str.replace(',', '', regex=False)
gen_activity_df['Time In Session'] = gen_activity_df['Time In Session'].str.replace("'", 'm', regex=False)
gen_activity_df['Time In Session'] = gen_activity_df['Time In Session'].fillna('0h 0m')

In [17]:
# Review the first few rows of the cleaned general activity dataframe
gen_activity_df.head()

Unnamed: 0,Year,Congress,Session,Chamber,Days In Session,Time In Session,Pages Of Proceedings,Extension Of Remarks,Public Bills Enacted Into Law,Private Bills Enacted Into Law,...,Bills Through Conference,Special Reports,Conference Reports,Measures Pending On Calendar,Quorum Calls,Yea-And-Nay Votes,Recorded Votes,Bills Vetoed,Vetoes Overridden,Bills Not Signed
0,1983,98,1,Senate,150,1010h 47m,0,0,80,0,...,4,25,4,152,18,381,0,3,1,0
1,1983,98,1,House,146,851h 45m,0,0,75,4,...,29,45,33,102,35,297,201,3,1,0
2,1984,98,2,Senate,131,940h 28m,14612,0,112,14,...,21,11,0,221,19,292,0,2,1,0
3,1984,98,2,House,120,852h 59m,12284,0,133,11,...,27,40,53,120,55,227,181,1,1,0
4,1985,99,1,Senate,170,1252h 31m,18418,0,110,0,...,8,18,2,94,20,381,0,0,1,0
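
If a numeric duration is needed later, the `##h ##m` strings can be converted to total minutes. This is a sketch of one possible follow-up step, not something the original notebook does; `session_minutes` is a hypothetical helper.

```python
import re


def session_minutes(value):
    """Convert a string such as '1010h 47m' to total minutes."""
    match = re.fullmatch(r'(\d+)h (\d+)m', value)
    if match is None:
        raise ValueError(f'Unexpected time format: {value!r}')
    hours, minutes = map(int, match.groups())
    return hours * 60 + minutes


print(session_minutes('1010h 47m'))  # 60647
```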


***
# Cleanup Data - Legislative Activity
***

In [18]:
# Review the first few rows of the legislative measures dataframe
measures_df.head()

Unnamed: 0,Year,Congress,Session,Chamber,"Measures passed, total","Measures passed, total...Senate bills","Measures passed, total...House bills","Measures passed, total...Senate joint resolutions","Measures passed, total...House joint resolutions","Measures passed, total...Senate concurrent resolutions",...,"Measures reported, total...Senate joint resolutions","Measures reported, total...House joint resolutions","Measures reported, total...Senate concurrent resolutions","Measures reported, total...House concurrent resolutions","Measures reported, total...Simple resolutions","Measures introduced, total","Measures introduced, total...Bills","Measures introduced, total...Joint resolutions","Measures introduced, total...Concurrent resolutions","Measures introduced, total...Simple resolutions"
0,1983,98,1,Senate,596,170,99,89,34,25,...,87.0,9,19.0,2.0,139,2795,2196,209,86,302
1,1983,98,1,House,611,70,234,46,49,16,...,,9,3.0,7.0,132,5642,4580,440,237,385
2,1984,98,2,Senate,726,159,239,90,55,25,...,99.0,16,17.0,4.0,122,1302,897,150,69,186
3,1984,98,2,House,737,128,323,67,61,18,...,2.0,12,,2.0,105,2462,1862,223,142,235
4,1985,99,1,Senate,583,106,93,120,54,29,...,118.0,19,16.0,,100,2651,2000,255,102,294


In [19]:
# Remove the totals columns, as we can compute these as needed
measures_df.drop(measures_df.filter(regex='total$').columns, axis=1, inplace=True)

In [20]:
# Transpose the measures dataframe to move variable columns to rows
measures_df = measures_df.melt(id_vars=key_cols)

In [21]:
# Delete NaN rows
measures_df = measures_df.loc[measures_df['value'].notna()]
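
The melt and NaN-filter steps can be seen on a small frame. This is an illustrative sketch: it uses only two of the key columns and two measure columns, with values copied from the first two rows of the preview above.

```python
import pandas as pd

# Two rows and two measure columns taken from the preview above
wide = pd.DataFrame({
    'Year': [1983, 1983],
    'Chamber': ['Senate', 'House'],
    'Measures passed, total...Senate bills': [170, 70],
    'Measures reported, total...Senate joint resolutions': [87, None],
})

# One row per (key columns, variable) pair; the former column names land in 'variable'
long = wide.melt(id_vars=['Year', 'Chamber'])

# Drop combinations that had no value in the wide layout
long = long.loc[long['value'].notna()]
print(long)
```

Because rows are filtered after melting, the index keeps gaps, which is why `measures_df.info()` later reports 1302 entries with index labels running from 0 to 1439.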

In [22]:
# Separate the measures variable column into Measure Type and Action
cols = key_cols.copy()
cols.append('value')
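# regex=False splits on the literal separator (otherwise each '.' would match any character)
measures_df = pd.concat([measures_df[cols], measures_df['variable'].str.split(', total...', expand=True, regex=False)], axis=1)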

In [23]:
# Label the Measure Type and Action columns and clean up the data in each
measures_df.rename(columns={0: 'Action', 1: 'Measure Type', 'value': 'Count'}, inplace=True)
# Drop the leading 'Measures ' (9 characters), leaving e.g. 'passed', then capitalize
measures_df['Action'] = measures_df['Action'].str[9:]
measures_df['Action'] = measures_df['Action'].str.title()

In [24]:
# Convert all numeric columns to int
cols = measures_df.columns.to_list()
int_cols = [x for x in cols if x not in ['Chamber', 'Measure Type', 'Action']]
measures_df[int_cols] = measures_df[int_cols].astype(int)

In [25]:
# Reorder the columns for ease of use during analysis
measures_df = measures_df[key_cols + ['Measure Type', 'Action', 'Count']].copy()

In [26]:
# Review the first few rows of the cleaned legislative measures dataframe
measures_df.head()

Unnamed: 0,Year,Congress,Session,Chamber,Measure Type,Action,Count
0,1983,98,1,Senate,Senate bills,Passed,170
1,1983,98,1,House,Senate bills,Passed,70
2,1984,98,2,Senate,Senate bills,Passed,159
3,1984,98,2,House,Senate bills,Passed,128
4,1985,99,1,Senate,Senate bills,Passed,106


***
# Cleanup Data - Confirmations
***

In [27]:
# Review the first few rows of the confirmation dataframe
confirm_df.head()

Unnamed: 0,Year,Congress,Session,Chamber,Army nominations,Army nominations...Confirmed,Army nominations...Failed at August-September adjournment,Army nominations...Failed at November 18 sine die adjournment,Navy nominations,Navy nominations...Confirmed,...,Other civilian nominations,Other civilian nominations...Confirmed,Other civilian nominations...Withdrawn,Other civilian nominations...Returned to White House,Summary...Total nominations received this Session,Other Civilian nominations...Withdrawn,Space Force nominations,Space Force nominations...Confirmed,Space Force nominations...Unconfirmed,Space Force nominations...Withdrawn
0,1983,98,1,Senate,14784.0,14782.0,1.0,1.0,21994,21994.0,...,,,,,,,,,,
1,1984,98,2,Senate,14031.0,14031.0,,,8855,8855.0,...,,,,,,,,,,
2,1985,99,1,Senate,15370.0,14478.0,,,16721,16720.0,...,,,,,,,,,,
3,1986,99,2,Senate,9918.0,10810.0,,,9952,9952.0,...,,,,,,,,,,
4,1987,100,1,Senate,14448.0,12086.0,,,12101,12055.0,...,,,,,,,,,,


In [28]:
# Drop all summary columns
confirm_df.drop(confirm_df.filter(regex='(?i)Summary').columns, axis=1, inplace=True)

In [29]:
# Transpose the confirmation dataframe to move variable columns to rows
confirm_df = confirm_df.melt(id_vars=key_cols)

In [30]:
# Delete NaN rows
confirm_df = confirm_df.loc[confirm_df['value'].notna()]

In [31]:
# Separate the confirmation variable column
cols = key_cols.copy()
cols.append('value')
confirm_df = pd.concat([confirm_df[cols], confirm_df['variable'].str.split(r'\.\.\.', expand=True)], axis=1)

In [32]:
# Label the Branch and Action columns
confirm_df.rename(columns={0: 'Branch', 1: 'Action', 'value': 'Count'}, inplace=True)

In [33]:
# Set missing Action cells to 'Nomination'
confirm_df['Action'] = confirm_df['Action'].fillna('Nomination')

In [34]:
# For simplicity, group all civilian nominations together
confirm_df.loc[confirm_df['Branch'].str.contains('Civilian', case=False), 'Branch'] = 'Civilian'

In [35]:
# Relabel recess appointment activity for consistency
confirm_df.loc[confirm_df['Action'].str.contains('Recess', case=False), 'Action'] = 'Recess Appointment'

In [36]:
# Remove "nominations" from Branch values
confirm_df['Branch'] = confirm_df['Branch'].str.replace(' nominations', '', regex=False)

In [37]:
# Convert numeric columns to int
cols = confirm_df.columns.to_list() 
int_cols = [x for x in cols if x not in ['Chamber', 'Branch', 'Action']]
confirm_df[int_cols] = confirm_df[int_cols].astype(int)

In [38]:
# Reorder the columns for ease of use during analysis
confirm_df = confirm_df[key_cols + ['Branch', 'Action', 'Count']].copy()

In [39]:
# Group all failed / returned nominations together and relabel
confirm_df.loc[confirm_df['Action'].str.contains('Returned|Failed'), 'Action'] = 'Returned to White House'
confirm_df = confirm_df.groupby(by=['Year', 'Congress', 'Session', 'Chamber', 'Branch', 'Action']).sum()
confirm_df.reset_index(inplace=True)

In [40]:
# Review the first few rows of the cleaned confirmations dataframe
confirm_df.head()

Unnamed: 0,Year,Congress,Session,Chamber,Branch,Action,Count
0,1983,98,1,Senate,Air Force,Confirmed,12792
1,1983,98,1,Senate,Air Force,Nomination,12819
2,1983,98,1,Senate,Air Force,Returned to White House,1
3,1983,98,1,Senate,Air Force,Unconfirmed,26
4,1983,98,1,Senate,Army,Confirmed,14782


***
# Preview Data
***

In [41]:
gen_activity_df.head()

Unnamed: 0,Year,Congress,Session,Chamber,Days In Session,Time In Session,Pages Of Proceedings,Extension Of Remarks,Public Bills Enacted Into Law,Private Bills Enacted Into Law,...,Bills Through Conference,Special Reports,Conference Reports,Measures Pending On Calendar,Quorum Calls,Yea-And-Nay Votes,Recorded Votes,Bills Vetoed,Vetoes Overridden,Bills Not Signed
0,1983,98,1,Senate,150,1010h 47m,0,0,80,0,...,4,25,4,152,18,381,0,3,1,0
1,1983,98,1,House,146,851h 45m,0,0,75,4,...,29,45,33,102,35,297,201,3,1,0
2,1984,98,2,Senate,131,940h 28m,14612,0,112,14,...,21,11,0,221,19,292,0,2,1,0
3,1984,98,2,House,120,852h 59m,12284,0,133,11,...,27,40,53,120,55,227,181,1,1,0
4,1985,99,1,Senate,170,1252h 31m,18418,0,110,0,...,8,18,2,94,20,381,0,0,1,0


In [42]:
gen_activity_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 80 entries, 0 to 79
Data columns (total 21 columns):
 #   Column                          Non-Null Count  Dtype 
---  ------                          --------------  ----- 
 0   Year                            80 non-null     int32 
 1   Congress                        80 non-null     int32 
 2   Session                         80 non-null     int32 
 3   Chamber                         80 non-null     object
 4   Days In Session                 80 non-null     int32 
 5   Time In Session                 80 non-null     object
 6   Pages Of Proceedings            80 non-null     object
 7   Extension Of Remarks            80 non-null     object
 8   Public Bills Enacted Into Law   80 non-null     int32 
 9   Private Bills Enacted Into Law  80 non-null     int32 
 10  Bills In Conference             80 non-null     int32 
 11  Bills Through Conference        80 non-null     int32 
 12  Special Reports                 80 non-null     int3

In [43]:
measures_df.head()

Unnamed: 0,Year,Congress,Session,Chamber,Measure Type,Action,Count
0,1983,98,1,Senate,Senate bills,Passed,170
1,1983,98,1,House,Senate bills,Passed,70
2,1984,98,2,Senate,Senate bills,Passed,159
3,1984,98,2,House,Senate bills,Passed,128
4,1985,99,1,Senate,Senate bills,Passed,106


In [44]:
measures_df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 1302 entries, 0 to 1439
Data columns (total 7 columns):
 #   Column        Non-Null Count  Dtype 
---  ------        --------------  ----- 
 0   Year          1302 non-null   int32 
 1   Congress      1302 non-null   int32 
 2   Session       1302 non-null   int32 
 3   Chamber       1302 non-null   object
 4   Measure Type  1302 non-null   object
 5   Action        1302 non-null   object
 6   Count         1302 non-null   int32 
dtypes: int32(4), object(3)
memory usage: 61.0+ KB


In [45]:
confirm_df.head()

Unnamed: 0,Year,Congress,Session,Chamber,Branch,Action,Count
0,1983,98,1,Senate,Air Force,Confirmed,12792
1,1983,98,1,Senate,Air Force,Nomination,12819
2,1983,98,1,Senate,Air Force,Returned to White House,1
3,1983,98,1,Senate,Air Force,Unconfirmed,26
4,1983,98,1,Senate,Army,Confirmed,14782


In [46]:
confirm_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 701 entries, 0 to 700
Data columns (total 7 columns):
 #   Column    Non-Null Count  Dtype 
---  ------    --------------  ----- 
 0   Year      701 non-null    int64 
 1   Congress  701 non-null    int64 
 2   Session   701 non-null    int64 
 3   Chamber   701 non-null    object
 4   Branch    701 non-null    object
 5   Action    701 non-null    object
 6   Count     701 non-null    int32 
dtypes: int32(1), int64(3), object(3)
memory usage: 35.7+ KB


***
# Write to Excel
***

In [47]:
with pd.ExcelWriter('../Data/Resume Data - Scrubbed.xlsx') as writer:
    gen_activity_df.to_excel(writer, sheet_name='General Activity', index=False)
    measures_df.to_excel(writer, sheet_name='Legislative Measures', index=False)
    confirm_df.to_excel(writer, sheet_name='Confirmations', index=False)

***
**End**
***