# Analysis of Austin, TX Budget and Expenditures

Data is taken from the AustinTexas.gov website at https://data.austintexas.gov/Budget-and-Finance/Program-Budget-Operating-Budget-Vs-Expense-Raw-Dat/g5k8-8sud/about_data. 

Each row corresponds to a logged cost line with a budget, an expenditure, or both. The data includes both a year and a quarter (1 through 4, each a three-month block), but every entry here has year 2024 and quarter 3, so no time analysis can be done (yet).

## Data Preparation

After downloading, all processing is done within the Azure ecosystem. The original download is stored in an Azure Blob Storage container, then copied over to a SQL Server and saved into a database using Data Factory. The visualizations come from SandDance, used within Azure Data Studio. The tool doesn't support downloading the plots, so they're shamelessly screenshotted and inserted into this Azure Notebook (which is just a Jupyter Notebook made within Azure Data Studio).

Let's see some sample rows ...

In [184]:
SELECT TOP (20) *
FROM AustinTXData.BUDGET;

BUDGET_FISCAL_YEAR,THRU_QUARTER,DEPT_ROLLUP,DEPT_ROLLUP_NAME,DEPARTMENT_CODE,DEPARTMENT_NAME,FUND_CODE,FUND_NAME,PROGRAM_CODE,PROGRAM_NAME,ACTIVITY_CODE,ACTIVITY_NAME,UNIT_CODE,UNIT_NAME,EXPENSE_CODE,EXPENSE_NAME,BUDGET,EXPENDITURES,KEY
2024,3,60,Public Works,6000,Capital Delivery Services,5460,Capital Projects Management Fund,8CDS,Business Enterprises,8CDS,Business Enterprises,8783,Financial Services Division,5033,Jury leave,0,0.0,2024360600054608CDS8CDS87835033
2024,3,60,Public Works,6000,Capital Delivery Services,5460,Capital Projects Management Fund,8CDS,Business Enterprises,8CDS,Business Enterprises,8783,Financial Services Division,5034,Bad weather pay,0,0.0,2024360600054608CDS8CDS87835034
2024,3,60,Public Works,6000,Capital Delivery Services,5460,Capital Projects Management Fund,8CDS,Business Enterprises,8CDS,Business Enterprises,8783,Financial Services Division,5051,Personnel savings,-160714,0.0,2024360600054608CDS8CDS87835051
2024,3,60,Public Works,6000,Capital Delivery Services,5460,Capital Projects Management Fund,8CDS,Business Enterprises,8CDS,Business Enterprises,8783,Financial Services Division,5280,Consultant-others,24750,0.0,2024360600054608CDS8CDS87835280
2024,3,60,Public Works,6000,Capital Delivery Services,5460,Capital Projects Management Fund,8CDS,Business Enterprises,8CDS,Business Enterprises,8783,Financial Services Division,6361,Awards and Recognition,1560,0.0,2024360600054608CDS8CDS87836361
2024,3,60,Public Works,6000,Capital Delivery Services,5460,Capital Projects Management Fund,8CDS,Business Enterprises,8CDS,Business Enterprises,8783,Financial Services Division,6388,Maintenance-computer software,0,0.0,2024360600054608CDS8CDS87836388
2024,3,60,Public Works,6000,Capital Delivery Services,5460,Capital Projects Management Fund,8CDS,Business Enterprises,8CDS,Business Enterprises,8783,Financial Services Division,6531,Seminar/training fees,24000,0.0,2024360600054608CDS8CDS87836531
2024,3,60,Public Works,6000,Capital Delivery Services,5460,Capital Projects Management Fund,8CDS,Business Enterprises,8CDS,Business Enterprises,8783,Financial Services Division,6532,Educational travel,36000,339.66,2024360600054608CDS8CDS87836532
2024,3,60,Public Works,6000,Capital Delivery Services,5460,Capital Projects Management Fund,8CDS,Business Enterprises,8CDS,Business Enterprises,8783,Financial Services Division,6551,Mileage reimbursements,150,0.0,2024360600054608CDS8CDS87836551
2024,3,60,Public Works,6000,Capital Delivery Services,5460,Capital Projects Management Fund,8CDS,Business Enterprises,8CDS,Business Enterprises,8783,Financial Services Division,6558,Professional registration,500,300.0,2024360600054608CDS8CDS87836558


... and check the column types:

In [185]:
SELECT COLUMN_NAME, DATA_TYPE
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'BUDGET';

COLUMN_NAME,DATA_TYPE
BUDGET_FISCAL_YEAR,int
THRU_QUARTER,int
DEPT_ROLLUP,int
DEPT_ROLLUP_NAME,nvarchar
DEPARTMENT_CODE,int
DEPARTMENT_NAME,nvarchar
FUND_CODE,nvarchar
FUND_NAME,nvarchar
PROGRAM_CODE,nvarchar
PROGRAM_NAME,nvarchar


# Analysis

## Budget Types

The interpretation of the budgets vs expenditures columns is not what you would expect, as described in the dataset description from the website:

*"The comparison of actual expenditures to budget may appear inconsistent. That is because base wages for personnel are fully budgeted in the expense categories regular wages—full-time, regular wages—part-time or regular wages—civil service. **The budget does not assume expenditure levels for the various leave categories, such as sick pay, vacation pay, or jury leave.** However, actual expenses for various leave categories are recorded based on timesheet coding. The result is that **actual expenditures for regular wages are spread across multiple expense categories while the budget is shown in one expense category.***

*"Personnel savings is budgeted to account for the likely savings in personnel costs generated through attrition. However, the savings is realized in the expense categories regular wages—full-time, regular wages—part-time and regular wages—civil service. Therefore, **the actual expenditures in the personnel savings expense category will always be zero."***

Since the budgets and expenses are atypical, we should build a contingency table to see how many entries have a negative, zero, or positive budget and expenditure. We'll define a function to compute the counts in each bivariate category below:

In [186]:
-- Drop function if exists
IF OBJECT_ID('AustinTXData.BudgetExpenditureCounts', 'TF') IS NOT NULL
    DROP FUNCTION AustinTXData.BudgetExpenditureCounts;
GO

-- Define func to get 3 x 3 contingency table of negative, 0, positive counts of
-- Budget vs Expenditure
CREATE FUNCTION AustinTXData.BudgetExpenditureCounts()
RETURNS @ResultTable TABLE 
(
    BudgetCategory NVARCHAR(50),
    Expenditures_LT_0 INT,
    Expenditures_EQ_0 INT,
    Expenditures_GT_0 INT,
    ColSum INT
)
AS
BEGIN
    
    INSERT INTO @ResultTable
    SELECT 'Budget < 0' AS BudgetCategory,
           SUM(CASE WHEN BUDGET < 0 AND EXPENDITURES < 0 THEN 1 ELSE 0 END),
           SUM(CASE WHEN BUDGET < 0 AND EXPENDITURES = 0 THEN 1 ELSE 0 END),
           SUM(CASE WHEN BUDGET < 0 AND EXPENDITURES > 0 THEN 1 ELSE 0 END),
           SUM(CASE WHEN BUDGET < 0 THEN 1 ELSE 0 END)
    FROM AustinTXData.BUDGET
    UNION ALL
    SELECT 'Budget = 0',
           SUM(CASE WHEN BUDGET = 0 AND EXPENDITURES < 0 THEN 1 ELSE 0 END),
           SUM(CASE WHEN BUDGET = 0 AND EXPENDITURES = 0 THEN 1 ELSE 0 END),
           SUM(CASE WHEN BUDGET = 0 AND EXPENDITURES > 0 THEN 1 ELSE 0 END),
           SUM(CASE WHEN BUDGET = 0 THEN 1 ELSE 0 END)
    FROM AustinTXData.BUDGET
    UNION ALL
    SELECT 'Budget > 0',
           SUM(CASE WHEN BUDGET > 0 AND EXPENDITURES < 0 THEN 1 ELSE 0 END),
           SUM(CASE WHEN BUDGET > 0 AND EXPENDITURES = 0 THEN 1 ELSE 0 END),
           SUM(CASE WHEN BUDGET > 0 AND EXPENDITURES > 0 THEN 1 ELSE 0 END),
           SUM(CASE WHEN BUDGET > 0 THEN 1 ELSE 0 END)
    FROM AustinTXData.BUDGET
    UNION ALL
    SELECT 'RowSum',
           SUM(CASE WHEN EXPENDITURES < 0 THEN 1 ELSE 0 END),
           SUM(CASE WHEN EXPENDITURES = 0 THEN 1 ELSE 0 END),
           SUM(CASE WHEN EXPENDITURES > 0 THEN 1 ELSE 0 END),
           COUNT(*) -- just get total table rows
    FROM AustinTXData.BUDGET;

    RETURN;
END;
GO

Now let's see both the counts in the contingency table and their percentages:

In [187]:
-- show counts table
SELECT * FROM AustinTXData.BudgetExpenditureCounts();

-- show percentages table
WITH counts AS 
(
    SELECT * FROM AustinTXData.BudgetExpenditureCounts()
),

total_counts AS 
(
    SELECT ColSum AS total_size
    FROM counts
    WHERE BudgetCategory = 'RowSum'
)

SELECT 
    BudgetCategory,
    CAST(Expenditures_LT_0 * 100.0 / total_counts.total_size AS DECIMAL(5,2)) AS Expenditures_LT_0,
    CAST(Expenditures_EQ_0 * 100.0 / total_counts.total_size AS DECIMAL(5,2)) AS Expenditures_EQ_0,
    CAST(Expenditures_GT_0 * 100.0 / total_counts.total_size AS DECIMAL(5,2)) AS Expenditures_GT_0,
    CAST(ColSum * 100.0 / total_counts.total_size AS DECIMAL(5,2)) AS ColSum
FROM counts, total_counts;

BudgetCategory,Expenditures_LT_0,Expenditures_EQ_0,Expenditures_GT_0,ColSum
Budget < 0,254,1026,7,1287
Budget = 0,489,12817,14729,28035
Budget > 0,49,6547,22441,29037
RowSum,792,20390,37177,58359


BudgetCategory,Expenditures_LT_0,Expenditures_EQ_0,Expenditures_GT_0,ColSum
Budget < 0,0.44,1.76,0.01,2.21
Budget = 0,0.84,21.96,25.24,48.04
Budget > 0,0.08,11.22,38.45,49.76
RowSum,1.36,34.94,63.7,100.0


So from the margins, we can see that:

1. **Budgets:** 2% have negative budgets, 48% have zero-valued budgets, and the other 50% have positive budgets.

2. **Expenditures:** A bit over 1% have negative expenditures, 35% have zero-valued expenditures, and almost 64% have positive expenditures.

There are also additional specific observations:

1. Almost 22% of the table consists of entries that have neither a budget nor an expenditure.

2. 0.01% (7 rows) have negative budget yet positive expenditure.
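
The percentages follow directly from the counts table, so they can be re-derived outside SQL as a sanity check. The snippet below is a small Python sketch (Python is not part of the original notebook, and the variable names are made up for illustration) that hard-codes the counts printed above and recomputes the margins and the cell percentages.

```python
# Counts copied from the contingency table output above.
# Rows: budget < 0, = 0, > 0; columns: expenditures < 0, = 0, > 0.
counts = {
    "Budget < 0": [254, 1026, 7],
    "Budget = 0": [489, 12817, 14729],
    "Budget > 0": [49, 6547, 22441],
}

total = sum(sum(row) for row in counts.values())

# Totals per budget category (the ColSum column in the SQL output).
budget_margin = {name: sum(row) for name, row in counts.items()}

# Totals per expenditure category (the RowSum row in the SQL output).
expend_margin = [sum(col) for col in zip(*counts.values())]


def pct(n):
    """Share of all rows, in percent, rounded to 2 decimals."""
    return round(100.0 * n / total, 2)


print("total rows:", total)
for name, row in counts.items():
    print(name, [pct(n) for n in row], pct(budget_margin[name]))
print("RowSum", [pct(n) for n in expend_margin])
```

The recomputed margins match the `ColSum` and `RowSum` entries above, which confirms that the nine cells partition the table with no row counted twice or missed.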



## Questions:

We have a little more insight into the data, so now we can pose some questions:

1. Which departments are responsible for the majority of these logs?

2. When grouped by department and for records with both positive budgets and expenditures, did any go beyond their budget?

3. What were the most costly expenditures?

So that we can re-use code with ease, we'll create some stored procedures that can take in an arbitrary column (so long as it's categorical data). First we'll need to create a user-defined table type so that we can pass tables as parameters to the stored procedures that follow.

In [188]:
-- Create TableType only if it doesn't already exist (so the cell can be re-run safely)
IF NOT EXISTS (SELECT * FROM sys.types WHERE is_table_type = 1 AND name = 'TableType')
BEGIN
    CREATE TYPE TableType AS TABLE
    (
        BUDGET_FISCAL_YEAR INT,
        THRU_QUARTER INT,
        DEPT_ROLLUP INT,
        DEPT_ROLLUP_NAME NVARCHAR(MAX),  
        DEPARTMENT_CODE INT,
        DEPARTMENT_NAME NVARCHAR(MAX),   
        FUND_CODE NVARCHAR(MAX),         
        FUND_NAME NVARCHAR(MAX),         
        PROGRAM_CODE NVARCHAR(MAX),      
        PROGRAM_NAME NVARCHAR(MAX),      
        ACTIVITY_CODE NVARCHAR(MAX),     
        ACTIVITY_NAME NVARCHAR(MAX),     
        UNIT_CODE NVARCHAR(MAX),         
        UNIT_NAME NVARCHAR(MAX),         
        EXPENSE_CODE NVARCHAR(MAX),      
        EXPENSE_NAME NVARCHAR(MAX),      
        BUDGET FLOAT,
        EXPENDITURES FLOAT,
        [KEY] NVARCHAR(MAX)
    )
END;
GO

Now we can define the procedures. One will be to simply count the number of distinct elements and the other to show the counts (as well as percentages and cumulative percentages) of each category.

Here's the stored procedure to calculate the number of unique elements for a category:

In [189]:
IF OBJECT_ID('NumUnique', 'P') IS NOT NULL
    DROP PROCEDURE NumUnique;
GO

CREATE PROCEDURE NumUnique
(
    @Table_ TableType READONLY,  -- Table to read over
    @Col NVARCHAR(128)           -- col to count distinct types for
)
AS
BEGIN
    DECLARE @SQL NVARCHAR(MAX);

    -- Note: QUOTENAME(@Col) alone can't directly follow NumUnique_ in the alias
    -- (NumUnique_[col] is not a valid name), so the whole alias is quoted instead.
    SET @SQL = N'
    SELECT
        COUNT(DISTINCT ' + QUOTENAME(@Col) + ') AS ' + QUOTENAME(N'NumUnique_' + @Col) + '
    FROM @Table_';
    
    EXEC sp_executesql @SQL, N'@Table_ TableType READONLY', @Table_;
END;
GO

And here's the more-involved stored procedure that returns the ...

1. unique value of the category,
2. raw count of rows,
3. percentage of those counts out of the total,
4. cumulative percentage of the counts for all unique values up to that row.

Additionally, we'll also show the

5. budget (in millions of \$'s),
6. expenditures (in millions of \$'s),
7. how much under-budget (i.e. budget - expense, also in millions of \$'s),
8. the percentage under-budget (i.e. 100 * [1 - expenses / budget]). Note that this can be > 100% when the summed budget and expenditures have opposite signs.

In [190]:
IF OBJECT_ID('CategStats', 'P') IS NOT NULL
    DROP PROCEDURE CategStats;
GO

CREATE PROCEDURE CategStats
(
    @Table_ TableType READONLY,  -- Table to read over
    @Col NVARCHAR(128),          -- categ col to group by
    @cutoff FLOAT = 100.0,       -- only get rows up to this cumul perc
    @showbudget INT = 1          -- show budget-related cols if 1 / hide if 0
)
AS
BEGIN
    DECLARE @SQL NVARCHAR(MAX);

    SET @SQL = N'
    WITH dept_count AS
    (
        -- Step 1: Aggregate the data from the passed table
        SELECT
            ' + QUOTENAME(@Col) + ' AS Col,
            COUNT(*) AS count_,
            ' + CASE WHEN @showbudget = 1 
                     THEN 'ROUND(SUM(BUDGET) / 1000000.0, 3) AS budget_M,
                           ROUND(SUM(EXPENDITURES) / 1000000.0, 3) AS expend_M' 
                     ELSE 'NULL AS budget_M, 
                           NULL AS expend_M' 
                END + '
        FROM 
            @Table_
        GROUP BY
            ' + QUOTENAME(@Col) + '
    ),

    perc_calc AS
    (
        SELECT
            Col,
            count_,
            budget_M,
            expend_M,
            ' + CASE WHEN @showbudget = 1 
                     THEN 'ROUND(budget_M - expend_M, 3) AS under_budget_M,
                           CASE WHEN budget_M = 0 THEN NULL ELSE ROUND(100 * (1 - expend_M / budget_M), 2) END AS perc_under_budget' 
                     ELSE 'NULL AS under_budget_M, NULL AS perc_under_budget'
                END + ',
            ROUND(count_ * 100.0 / SUM(count_) OVER (), 2) AS perc
        FROM
            dept_count
    ),
    
    cumul_calc AS
    (
        SELECT
            Col,
            count_,
            budget_M,
            expend_M,
            under_budget_M,
            perc_under_budget,
            perc,
            SUM(perc) OVER (ORDER BY count_ DESC) AS cumul_perc
        FROM
            perc_calc
    )
    
    SELECT
        Col AS ' + QUOTENAME(@Col) + ',
        count_,
        CAST(perc AS DECIMAL(5,2)) AS perc,  -- DECIMAL(5,2) so a single 100.00% category fits
        CAST(cumul_perc AS DECIMAL(5,2)) AS cumul_perc' +
        CASE WHEN @showbudget = 1 
             THEN ', 
                   budget_M, 
                   expend_M, 
                   under_budget_M,
                   perc_under_budget' 
             ELSE '' 
        END + '
    FROM
        cumul_calc
    WHERE
        cumul_perc <= ' + CAST(@cutoff AS NVARCHAR(10)) + ' -- Apply the cutoff to limit rows by cumulative percentage
    ORDER BY
        count_ DESC;
    ';

    EXEC sp_executesql @SQL, N'@Table_ TableType READONLY', @Table_;
END;
GO
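
To make the two under-budget columns concrete, here is a minimal Python sketch of the same arithmetic. It is an illustration only: the function name `under_budget` is hypothetical, and it works on raw dollar amounts rather than the rounded millions used inside the procedure. It returns `None` where the SQL `CASE` expression returns NULL for a zero budget, and it shows how a negative expenditure pushes the percentage above 100.

```python
def under_budget(budget, expend):
    """Return (amount under budget, percent under budget).

    The percentage is None when the budget is zero, mirroring the
    NULL that the SQL CASE expression returns in that situation.
    """
    amount = budget - expend
    if budget == 0:
        return amount, None
    return amount, round(100 * (1 - expend / budget), 2)


# Values taken from the "Professional registration" sample row shown earlier.
print(under_budget(500, 300.0))     # (200.0, 40.0)

# Made-up values: a negative expenditure gives more than 100% under budget.
print(under_budget(1000, -250.0))   # (1250.0, 125.0)

# Made-up values: zero budget, so no percentage is defined.
print(under_budget(0, 75.0))        # (-75.0, None)
```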

With these procedures, we can easily analyze the distribution among the different columns. Now let's revisit the first question:

### **Q1: Which departments are responsible for the majority of these logs?**

In [191]:
-- The procedure was defined to accept a table parameter, so we have to copy the
-- original table to a table variable to accommodate.
DECLARE @SubTable TableType;
INSERT INTO @SubTable
SELECT * 
FROM AustinTXData.budget;

-- Get stats by Department
EXEC NumUnique @SubTable, 'DEPARTMENT_NAME';
EXEC CategStats @SubTable, 'DEPARTMENT_NAME', 99, 1;

NumUnique_DEPARTMENT_NAME
66


DEPARTMENT_NAME,count_,perc,cumul_perc,budget_M,expend_M,under_budget_M,perc_under_budget
Austin Energy,7112,12.19,12.19,1163.632,1118.023,45.609,3.92
Austin Water,7039,12.06,24.25,785.946,670.713,115.233,14.66
Parks and Recreation,7016,12.02,36.27,141.934,129.55,12.384,8.73
Police,4509,7.73,44.0,485.871,452.031,33.84,6.96
Transportation and Public Works,2852,4.89,48.89,228.288,179.228,49.06,21.49
Austin Public Health,2429,4.16,53.05,76.106,59.506,16.6,21.81
Watershed Protection,1991,3.41,56.46,115.191,101.474,13.717,11.91
Development Services,1966,3.37,59.83,114.609,102.888,11.721,10.23
Aviation,1855,3.18,63.01,376.302,247.226,129.076,34.3
Austin Resource Recovery,1677,2.87,65.88,126.102,113.952,12.15,9.64


The top three departments each account for about 12% of the logs (36% together), with diminishing percentages for the subsequent departments. This roughly resembles a stereotypical Pareto distribution ("20% of the members are responsible for 80% of the output"), although here it's more like 70%. However, these results include the logs with negative budgets as well as zero-valued budgets and expenditures. The plot of the percentage as a function of department (up to a cumulative 99%), color-coded by total budget (in millions of dollars), is shown below:

![title](plots/perc_DepartmentName_BudgetM.png)

### **Q1 Answer: Austin Energy (12.19%), Austin Water (12.06%), Parks and Recreation (12.02%), Police (7.73%), and Transportation and Public Works (4.89%) generated almost 49% of the logs - but this does not exclude "atypical" logs involving zero-valued budget and expenditures.**

### **Q2: When grouped by department and for records with both positive budgets and expenditures, did any go beyond their budget?**

Now that we can analyze categorical data easily and effectively, we'll look at the distributions for data where both budget AND expense are > 0. 

In [192]:
--Create table just for budgets, expenses > 0
DECLARE @SubTable TableType;
INSERT INTO @SubTable
SELECT * 
FROM AustinTXData.budget
WHERE BUDGET > 0 AND EXPENDITURES > 0;

-- Compute the stats
EXEC NumUnique @SubTable, 'DEPARTMENT_NAME';
EXEC CategStats @SubTable, 'DEPARTMENT_NAME', 99, 1;

NumUnique_DEPARTMENT_NAME
59


DEPARTMENT_NAME,count_,perc,cumul_perc,budget_M,expend_M,under_budget_M,perc_under_budget
Austin Water,3230,14.39,14.39,789.785,654.601,135.184,17.12
Parks and Recreation,2897,12.91,27.3,150.741,125.58,25.161,16.69
Austin Energy,2761,12.3,39.6,1264.575,1116.959,147.616,11.67
Police,1662,7.41,47.01,506.572,413.447,93.125,18.38
Transportation and Public Works,1199,5.34,52.35,234.894,170.239,64.655,27.53
Austin Public Health,846,3.77,56.12,68.36,55.824,12.536,18.34
Austin Resource Recovery,799,3.56,59.68,128.545,108.548,19.997,15.56
Watershed Protection,754,3.36,63.04,121.382,93.904,27.478,22.64
Development Services,719,3.2,66.24,121.614,95.404,26.21,21.55
Aviation,715,3.19,69.43,296.022,239.303,56.719,19.16


Here's a plot of the percent under-budget as a function of department, color-coded by the percentage of logs. Again, this is *with non-positive budgets and expenditures excluded*.

![title](plots/percUnderBudget_Department_perc_2.png)

### **Q2 Answer: None of the departments (up to cumulative 99%) went beyond their budget - The three departments with the lowest percent under-budget amounts were Austin Energy (11.67%), Human Resources (11.18%), and Mayor and Council (5.16%).**

### **Q3: What were the most costly expenditures?**

First, let's see how many different categories there are under EXPENSE_NAME.

In [193]:
--Create table for original as TVP
DECLARE @SubTable TableType;
INSERT INTO @SubTable
SELECT * 
FROM AustinTXData.budget;

-- Compute the stats
EXEC NumUnique @SubTable, 'EXPENSE_NAME';
EXEC CategStats @SubTable, 'EXPENSE_NAME', 40, 0;


NumUnique_EXPENSE_NAME
513


EXPENSE_NAME,count_,perc,cumul_perc
Medicare tax,1294,2.22,2.22
FICA tax,1292,2.21,4.43
Sick pay,1278,2.19,6.62
Awards and Recognition,1262,2.16,8.78
Insurance-health/life/dental,1260,2.16,10.94
Administrative leave,1257,2.15,15.24
Vacation pay,1257,2.15,15.24
Personal holiday pay,1255,2.15,17.39
Contribution to employees ret,1254,2.15,19.54
Holiday pay,1250,2.14,21.68
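
One detail worth noting in the output above: *Administrative leave* and *Vacation pay* tie at 1257 rows, and both show a `cumul_perc` of 15.24. In T-SQL, a windowed `SUM(...) OVER (ORDER BY ...)` without an explicit frame defaults to `RANGE UNBOUNDED PRECEDING`, which adds all tied rows at once. The Python sketch below (an illustration outside the notebook, with hypothetical function names) reproduces both behaviours on the `count_` and `perc` values shown above: the peer-inclusive running total, and a row-by-row one corresponding to `ROWS UNBOUNDED PRECEDING`.

```python
from itertools import groupby

# (count_, perc) pairs copied from the output above, already sorted by count_ DESC.
rows = [(1294, 2.22), (1292, 2.21), (1278, 2.19), (1262, 2.16), (1260, 2.16),
        (1257, 2.15), (1257, 2.15), (1255, 2.15), (1254, 2.15), (1250, 2.14)]


def cumul_rows(rows):
    """Running total row by row (like ROWS UNBOUNDED PRECEDING)."""
    out, running = [], 0.0
    for _, perc in rows:
        running += perc
        out.append(round(running, 2))
    return out


def cumul_range(rows):
    """Running total where tied counts are peers (like the default RANGE frame)."""
    out, running = [], 0.0
    for _, group in groupby(rows, key=lambda r: r[0]):
        group = list(group)
        running += sum(perc for _, perc in group)
        out.extend([round(running, 2)] * len(group))
    return out


print(cumul_range(rows))  # tied rows share one cumulative value
print(cumul_rows(rows))   # every row gets its own cumulative value
```

Because the cutoff filter in `CategStats` is applied to `cumul_perc`, tied rows are kept or dropped together. If distinct cumulative values are wanted, one option would be to add a tie-breaker column to the window's `ORDER BY` and specify `ROWS UNBOUNDED PRECEDING`.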


There are 513 different expense names in the table, and their distribution seems almost uniform over the 15 largest categories before it starts to drop. Next, let's group the expenditures by expense name and sort them in descending order.

But there can be some peculiarities with negative budgets and expenditures... so we'll simultaneously sum the *absolute value* of the expenditures and budgets and subtract them from the original sums. If there's a significant difference, then it means there's a sizable amount of negative budget or expenditure, and if not, then it's not worth investigating.

In [194]:
WITH expense AS (
    SELECT
        EXPENSE_NAME,
        CAST(SUM(EXPENDITURES) / 1000000.0 AS DECIMAL(8,3)) AS expend_M,
        CAST(SUM(ABS(EXPENDITURES)) / 1000000.0 AS DECIMAL(8,3)) AS expend_abs_M,
        SUM(SUM(EXPENDITURES)) OVER () / 1000000.0 AS total_expend_M,
        CAST(SUM(BUDGET) / 1000000.0 AS DECIMAL(8,3)) AS budget_M,
        CAST(SUM(ABS(BUDGET)) / 1000000.0 AS DECIMAL(8,3)) AS budget_abs_M,
        SUM(SUM(BUDGET)) OVER () / 1000000.0 AS total_budget_M
    FROM
        AustinTXData.budget
    GROUP BY
        EXPENSE_NAME
)

SELECT TOP (30)
    EXPENSE_NAME,
    expend_M,
    expend_M - expend_abs_M AS expend_diff,
    CAST(100 * expend_M / total_expend_M AS DECIMAL(6,2)) AS perc_expend,
    budget_M,
    budget_M - budget_abs_M AS budget_diff,
    CAST(100 * budget_M / total_budget_M AS DECIMAL(6,2)) AS perc_budget
FROM   
    expense
ORDER BY
    expend_M DESC;

EXPENSE_NAME,expend_M,expend_diff,perc_expend,budget_M,budget_diff,perc_budget
Regular wages - full-time,687.023,0.0,10.92,959.558,0.0,13.47
Services-other,273.559,-0.489,4.35,345.366,0.0,4.85
Trf to Util D/S Separate Lien,268.879,0.0,4.27,335.319,0.0,4.71
Interest payment D/S funds,248.02,0.0,3.94,254.33,0.0,3.57
Trf to Electric CIP Fund,239.286,0.0,3.8,275.644,0.0,3.87
Regular wages - Civil Services,229.244,0.0,3.64,337.507,0.0,4.74
Insurance-health/life/dental,197.294,0.0,3.14,231.929,0.0,3.26
Principal payment D/S funds,192.345,0.0,3.06,200.63,0.0,2.82
Transmission Cost of Service,176.319,0.0,2.8,188.959,0.0,2.65
Medical Claims,176.256,0.0,2.8,206.15,0.0,2.89


*Regular wages - full-time* is the top expenditure at \$687M against a \$960M budget. Its budget and expenditure differences are zero to 3 decimal places, which indicates there are no negative values in either column for this category (at least none large enough to register at the \$1k level). As shares of the totals, it accounts for almost 11% of all expenditures and 13.5% of the total budget. It is followed by *Services-other* with \$274M (4.35%) and *Trf to Util D/S Separate Lien* with \$269M (4.27%).
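
The difference columns work because subtracting the sum of absolute values from the signed sum leaves exactly twice the total of the negative entries. A minimal Python sketch with made-up amounts (not taken from the dataset) illustrates the identity:

```python
# Made-up expenditure amounts for one hypothetical expense category.
amounts = [1200.0, -150.0, 300.0, -50.0, 0.0]

signed_sum = sum(amounts)                        # what SUM(EXPENDITURES) computes
abs_sum = sum(abs(a) for a in amounts)           # what SUM(ABS(EXPENDITURES)) computes
negative_total = sum(a for a in amounts if a < 0)

diff = signed_sum - abs_sum
print(diff, 2 * negative_total)   # both are -400.0
```

Read this way, the `expend_diff` of -0.489 for *Services-other* corresponds to roughly \$0.24M of negative expenditure entries in that category.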

Here's a plot of the expenditure (in millions of dollars) as a function of expense name, with the budget (also in millions of dollars) as the color. Again, the budget of some of the logs is defined to be 0, such as the 'Vacation pay' on row 25 of the query result above.

![image](plots/expense_expenseName_budget.png)

### **Q3 Answer: The most costly expenditure is easily *Regular wages - full-time*, which has a total expenditure amount of `$687M` (11\% of total expenses) vs its `$960M` budget (13.5\% of total budget) - and the subsequent expenses are each less than half of this.**

## Power BI

Continuing with the use of Microsoft resources, Power BI is used to better visualize the data. I used Power BI Desktop to generate the report, which doesn't allow publishing to a public repository like Tableau does, so screenshots of the full report and of its interaction with different categories of the data are shown below. The Power BI file itself is included in the same folder as the notebook file as a `.pbix` file.

*The original report:*

![image](plots/powerbi_original.png)

*The report showing results with the Austin Energy department selected:*

![image](plots/powerbi_AustinEnergySelected.png)

*The report showing results with the Expense **Regular Wages - Full Time** selected:*

![image](plots/powerbi_RegularWagesFullTimeSelected.png)

*The report showing results with the **Budget = 0, Expenditure > 0** category selected (the Total Budget and Expenses categories are blank because they were designed to show results only for the Budget, Expense > 0 category):*

![image](plots/powerbi_BudgetEQ0ExpendGT0Selected.png)