### Data Analysis Process

![Data Analysis Process](images/Intro%20to%20Data%20Analytics%20Case%20Study.png)


### Data Understanding

Data is sourced from the Dallas Open Data Portal: [Vendor Payments for Fiscal Year 2019-Present](https://www.dallasopendata.com/Economy/Vendor-Payments-for-Fiscal-Year-2019-Present/x5ih-idh7).
The cells below query the latest rows through the portal's API, limited to 50,000 rows (`$limit=50000`).

Resources:
- Questions sourced from [Dallas Open Records](https://dallastx.govqa.us/WEBAPP/_rs/(S(yvwdnfffcg5vrmly43gmvfag))/OpenRecordsSummary.aspx?sSessionID=)

The full CSV has 22 columns and 850,000+ rows.
The columns are described below.


| FIELD NAME | DESCRIPTION |
| --- | --- |
| RUN DATE | DATE POSTED |
| FY | FISCAL YEAR |
| FM | FISCAL MONTH - OCTOBER IS MONTH 1/SEPTEMBER IS MONTH 12/PERIOD 13 IS AN ADJUSTMENT PERIOD AT YEAR END |
| DOC-ID | SYSTEM DOCUMENT ID - READS AS FOLLOWS - DOCUMENT TYPE-DEPARTMENT-DOCUMENT NUMBER |
| CHKSUBTOT | PAYMENT SUBTOTAL BY LINE ITEM |
| VCODE | VENDOR CODE |
| VENDOR | VENDOR NAME |
| ZIP5 | VENDOR ZIP CODE |
| FTYP | FUND TYPE ABBREVIATION - SEE THE 'READ MORE' SECTION OF THIS WEBSITE FOR DESCRIPTIONS |
| FUND TYPE | FUND TYPE - SEE THE 'READ MORE' SECTION OF THIS WEBSITE FOR DESCRIPTIONS |
| DPT | DEPARTMENT ABBREVIATION |
| DEPARTMENT | DEPARTMENT |
| ACTV | INTERNAL ACTIVITY CODE |
| ACTIVITY | ACTIVITY |
| OGRP | OBJECT GROUP ABBREVIATION |
| OBJECTGROUP | OBJECT GROUP (SPECIFIES THE EXPENDITURE CLASSIFICATION) |
| OBJ | NUMERIC CODE ASSOCIATED WITH THE OBJECT |
| OBJECT | DESCRIPTION OF OBJ - OBJECTS ARE PLACED INTO OBJECT GROUPS FOR CLASSIFICATION, BASED ON THEIR OBJ CODE |
| COMM | COMMODITY CODE |
| COMMODITY DSCR | DESCRIPTION OF COMM (COMMODITY CODE) |
| INVOICEDATE | NOT APPLICABLE |
| INVOICENUMBER | NOT APPLICABLE |
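
The fiscal month and document ID conventions in the field descriptions can be sketched in a few lines of plain Python. The helper names `fiscal_period` and `split_doc_id` are hypothetical, and the code assumes the fiscal year is named for the calendar year in which it ends, which is consistent with the sample rows (a November 2023 run date carries FY 2024, FM 2). Period 13 is an accounting adjustment and cannot be derived from a date, so it is not handled here.

```python
from datetime import date


def fiscal_period(d: date) -> tuple[int, int]:
    """Return (fiscal_year, fiscal_month) for a calendar date.

    October is fiscal month 1 and September is fiscal month 12.
    Assumption: the fiscal year is labeled by the year in which it ends.
    """
    if d.month >= 10:
        # October-December belong to the next fiscal year (months 1-3).
        return d.year + 1, d.month - 9
    # January-September are fiscal months 4-12 of the current year.
    return d.year, d.month + 3


def split_doc_id(doc_id: str) -> dict[str, str]:
    """Split a document ID of the form TYPE-DEPARTMENT-NUMBER."""
    doc_type, department, number = doc_id.split("-", 2)
    return {"type": doc_type, "department": department, "number": number}


print(fiscal_period(date(2023, 11, 1)))
print(split_doc_id("EFT-BMS-EY240010158"))
```

The part labels returned by `split_doc_id` follow the field description above; they have not been checked against the portal's own documentation.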

In [83]:
#Import Libraries

import pandas as pd
import matplotlib.pyplot as plt
import datetime
import numpy as np
import seaborn as sns
import sidetable
#from pandas_profiling import ProfileReport 

In [84]:
#Read in data (this could also come from an Excel/CSV file, an API, a link, or a SQL database)
df = pd.read_csv("https://www.dallasopendata.com/resource/x5ih-idh7.csv?$limit=50000")
#examine the shape of the data (rows, columns)
df.shape

(50000, 22)
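
Code-like columns such as `zip5` and `vcode` are easier to work with as text, and `rundate` as a real datetime. The sketch below shows one way to request that at read time. It uses two made-up rows in an in-memory CSV rather than the live endpoint, so the vendor names are placeholders and only a subset of the 22 columns is present.

```python
import io

import pandas as pd

# Hypothetical rows shaped like the portal's CSV (subset of columns).
sample_csv = io.StringIO(
    "rundate,fy,fm,chksubtot,vcode,vendor,zip5\n"
    "2023-11-01T00:00:00.000,2024,2,540.63,MVORM001,EXAMPLE VENDOR A,\n"
    "2023-11-01T00:00:00.000,2024,2,667.44,VS0000073028,EXAMPLE VENDOR B,75284\n"
)

sample = pd.read_csv(
    sample_csv,
    dtype={"vcode": "string", "zip5": "string"},  # keep codes as text
    parse_dates=["rundate"],                       # parse ISO timestamps
)

print(sample.dtypes)
print(sample[["vendor", "zip5"]])
```

Reading `zip5` as a string keeps a ZIP code from being displayed as `75284.0` when the column also contains missing values.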

In [85]:
#formatting columns, specifically the chksubtot column
format_dict =  {'chksubtot':'${:,.2f}'}

In [86]:
#returns the first 5 rows of data
df.head()

#returns the last 5 rows of data
#df.tail()

Unnamed: 0,rundate,fy,fm,docid,chksubtot,vcode,vendor,zip5,ftyp,fundtype,...,actv,activity,ogrp,objectgroup,obj,object,comm,commoditydscr,invoicedate,invoicenumber
0,2023-11-01T00:00:00.000,2024,2,AD-BMS-AY240002070,540.63,MVORM001,SAMUEL RAYMOND MERRIFIELD AND THE KAR STORE,,OTHO,Other Operating Fund,...,RM12,Claims For Injury/Damage To Personal Property,QV,Contractual & Other Services,3521,Judgements & Damages,,NONE,,
1,2023-11-01T00:00:00.000,2024,2,EFT-BMS-EY240010158,667.44,VS0000073028,FORTILINE INC.,75284.0,EFOP,Enterprise Operating Fund,...,DW01,Water Treatment,QM,Supplies & Materials,2998,Inventory Purchase,67070.0,"VALVES, BRONZE: ANGLE, BALL, CHECK, GATE, GLOB...",,
2,2023-11-01T00:00:00.000,2024,2,EFT-BMS-EY240009859,473.88,331904,"M A N S DISTRIBUTORS, INC",75006.0,GNFD,General Fund,...,ST10,In-House Preservation,QM,Supplies & Materials,2220,Laundry & Cleaning Suppl,48500.0,"JANITORIAL SUPPLIES, GENERAL LINE",,
3,2023-11-01T00:00:00.000,2024,2,EFT-BMS-EY240009872,561.98,399372,THE AROUND THE CLOCK FREIGHTLINER GROUP LLC,75247.0,INSV,Internal Service Fund,...,EQ04,City Fleet Maintenance And Repair Services,QV,Contractual & Other Services,3110,Repair&Maint Serv Equip,6000.0,AUTOMOTIVE AND TRAILER MAINTENANCE ITEMS AND R...,,
4,2023-11-01T00:00:00.000,2024,2,AD-BMS-AY240002123,15207.5,VS0000071772,Carter Arnett PLLC,75206.0,OTHO,Other Operating Fund,...,RM01,Risk Management Services,QV,Contractual & Other Services,3033,Legal Fees,96149.0,"LEGAL SERVICES, ATTORNEYS",,


### Exploratory Data Analysis

In [87]:
df.dtypes
#examine the column data types

rundate           object
fy                 int64
fm                 int64
docid             object
chksubtot        float64
vcode             object
vendor            object
zip5              object
ftyp              object
fundtype          object
dpt               object
department        object
actv              object
activity          object
ogrp              object
objectgroup       object
obj               object
object            object
comm             float64
commoditydscr     object
invoicedate      float64
invoicenumber    float64
dtype: object

In [None]:
df.info()
#examine the info for the data

In [89]:
df.describe()
#describe the data

Unnamed: 0,fy,fm,chksubtot,comm,invoicedate,invoicenumber
count,50000.0,50000.0,50000.0,42994.0,0.0,0.0
mean,2023.2516,7.72722,18293.59,69524.431758,,
std,0.433937,4.093281,148247.8,34414.145995,,
min,2023.0,1.0,-167746.4,505.0,,
25%,2023.0,2.0,253.92,42044.0,,
50%,2023.0,9.0,944.79,91200.0,,
75%,2024.0,11.0,4081.275,96224.0,,
max,2024.0,12.0,9089393.0,99231.0,,


In [90]:
df.department.value_counts()
# here we are looking at the number of times a dept is represented

Water Utilities                             9695
Park & Recreation                           5944
Management Services                         5260
Inventory Purchase                          5223
Aviation                                    2360
Equipment & Fleet Management                2083
Dallas Police Dept                          1986
Public Works & Transporation                1849
Sanitation Svcs                             1616
Office Of Cultural Affairs                  1532
Building Services                           1397
Dallas Fire Department                      1385
Stormwater Drainage Management              1072
Code Compliance                             1015
Communication & Info Svcs                    990
Risk Management                              933
Library                                      784
Transportation                               777
Dallas Animal Services                       679
Sustainable Development and Construction     409
Housing/Community Se

In [91]:
#number of null values
#df.isnull().sum()

#share of null values per column (fraction of rows, as shown in the output below)
df.isnull().mean()



rundate          0.00000
fy               0.00000
fm               0.00000
docid            0.00000
chksubtot        0.00000
vcode            0.00000
vendor           0.00000
zip5             0.01176
ftyp             0.00000
fundtype         0.00000
dpt              0.00000
department       0.00000
actv             0.00000
activity         0.00000
ogrp             0.00000
objectgroup      0.00000
obj              0.00000
object           0.00000
comm             0.14012
commoditydscr    0.00000
invoicedate      1.00000
invoicenumber    1.00000
dtype: float64
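
The count and the share of missing values can also be placed side by side in one table. This is a minimal sketch on a made-up four-row frame; `toy` and `null_summary` are illustrative names, not part of the notebook.

```python
import pandas as pd

# Hypothetical data with a few missing values.
toy = pd.DataFrame(
    {
        "vendor": ["A", "B", "C", "D"],
        "zip5": ["75201", None, "75206", "75006"],
        "comm": [96269.0, None, None, 48500.0],
    }
)

# One row per column: how many values are missing and what share of rows that is.
null_summary = pd.DataFrame(
    {"missing": toy.isna().sum(), "share": toy.isna().mean()}
).sort_values("share", ascending=False)

print(null_summary)
```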

In [94]:
df['department'] = df.department.astype('string')
df.department.dtypes

StringDtype

#### Which department has the most vendor payouts for 2024 and what is its average payout amount?

In [104]:
format_dict = {'chksubtot':'${:,.0f}'}


In [105]:
#fy 2024
fy_24 = df[df.fy ==2024]
fy_24.style.format(format_dict)
fy_24.stb.freq(['department'], value='chksubtot', style=True, cum_cols=False)

Unnamed: 0,department,chksubtot,percent
0,Water Utilities,189150241,25.90%
1,Public Works & Transporation,116425382,15.94%
2,Aviation,44700953,6.12%
3,Park & Recreation,40307616,5.52%
4,Management Services,37123811,5.08%
5,Office Of Economic Development,35927205,4.92%
6,Convention And Event Services,35586155,4.87%
7,Communication & Info Svcs,35506174,4.86%
8,Inventory Purchase,26182335,3.59%
9,Sanitation Svcs,22746831,3.11%
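
The frequency table above reports total dollars per department, but the question also asks for the average payout. One way to get the number of payments, the total, and the average in a single pass is a named aggregation. The sketch below runs on a small made-up frame; on the real data the same chain would be applied to `df`. Whether "most payouts" means the highest count or the highest dollar total is a judgment call, so both are shown.

```python
import pandas as pd

# Hypothetical payments; the real frame has one row per payment line item.
toy = pd.DataFrame(
    {
        "fy": [2024, 2024, 2024, 2023],
        "department": ["Aviation", "Aviation", "Library", "Library"],
        "chksubtot": [100.0, 300.0, 50.0, 999.0],
    }
)

summary = (
    toy[toy["fy"] == 2024]                 # keep fiscal year 2024 only
    .groupby("department")["chksubtot"]
    .agg(payments="count", total="sum", average="mean")
    .sort_values("total", ascending=False)
)

print(summary)
```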


#### Question 2: AVI Vendors
##### Which vendor received the largest overall payout from the Aviation department?



In [109]:
avi_dept = df[df.department =='Aviation']
avi_dept.style.format(format_dict)
avi_dept.head()

Unnamed: 0,rundate,fy,fm,docid,chksubtot,vcode,vendor,zip5,ftyp,fundtype,...,objectgroup,obj,object,comm,commoditydscr,invoicedate,invoicenumber,cm,date_combo,fy_q
81,2023-11-01T00:00:00.000,2024,2,EFT-BMS-EY240009938,106.86,VC18843,Alan Mathew,75048,EFOP,Enterprise Operating Fund,...,Contractual & Other Services,3361,Professional Development,,NONE,,,11,2024-11-01,2025Q1
200,2023-11-01T00:00:00.000,2024,2,EFT-BMS-EY240010089,24744.45,VS0000031592,"APPLIED PAVEMENT TECHNOLOGY, INC.",53597,AVIC,Aviation Capital Program,...,Capital Outlay,4111,Engineering Design,84549.0,PAVEMENT TESTING AND DATA COLLECTION EQUIPMENT,,,11,2024-11-01,2025Q1
229,2023-11-01T00:00:00.000,2024,2,AD-BMS-AY240002126,110480.1,VS88417,"Hi-Lite Airfield Services, LLC",13601,EFOP,Enterprise Operating Fund,...,Contractual & Other Services,3210,Repair&Maint Ser Bldg Etc,91364.0,"MAINTENANCE AND REPAIR: AIRPORT ROADWAY, RUNWA...",,,11,2024-11-01,2025Q1
236,2023-11-01T00:00:00.000,2024,2,EFT-BMS-EY240010097,10035.0,VS0000051722,GRESHAM SMITH,37230,EFOP,Enterprise Operating Fund,...,Contractual & Other Services,3070,Professional Services,92642.0,ENVIRONMENTAL SERVICES (NOT OTHERWISE CLASSIFIED),,,11,2024-11-01,2025Q1
321,2023-11-01T00:00:00.000,2024,2,EFT-BMS-EY240009887,2489.2,500437,WINSTON WATER COOLER LTD,75373,EFOP,Enterprise Operating Fund,...,Supplies & Materials,2998,Inventory Purchase,67000.0,"PLUMBING EQUIPMENT, FIXTURES, AND SUPPLIES",,,11,2024-11-01,2025Q1


In [111]:
avi_dept.shape

(2360, 25)

In [110]:
#total payout per Aviation vendor, largest first (avi_dept was isolated from the rest of the data above)
aviation_vendors_sum = avi_dept.groupby(['vendor'])['chksubtot'].agg(['sum']).sort_values(by=['sum'],ascending=False)
aviation_vendors_sum

Unnamed: 0_level_0,sum
vendor,Unnamed: 1_level_1
"Flatiron Constructors, Inc.",20848125.89
"PARKING CONCEPTS, INC.",2103551.70
"DLF Denton, LLC",1808562.38
"Member's Building Maintenance, LLC.",1754570.15
HNTB CORPORATION,1492220.82
...,...
HOWARD CARTER,38.40
Hayes Hodges,33.30
THOMAS REPROGRAPHICS INC,24.00
Quick Acquisition LLC,0.00


In [113]:
#str.contains: search for payouts to "Flatiron Constructors" (case-insensitive match on 'Flatiron')
avi_dept.loc[avi_dept['vendor'].str.contains('Flatiron', case=False)]

Unnamed: 0,rundate,fy,fm,docid,chksubtot,vcode,vendor,zip5,ftyp,fundtype,...,objectgroup,obj,object,comm,commoditydscr,invoicedate,invoicenumber,cm,date_combo,fy_q
6845,2023-10-17T00:00:00.000,2024,1,EFT-BMS-EY240004669,2107853.49,VS96112,"Flatiron Constructors, Inc.",80021,AVRC,Aviation Capital Program.commercial paper,...,Capital Outlay,4599,Other Improvements-other than Building,92500.0,"ENGINEERING SERVICES, PROFESSIONAL",,,10,2024-10-01,2025Q1
16002,2023-09-19T00:00:00.000,2023,12,EFT-BMS-EY230054757,1365757.17,VS96112,"Flatiron Constructors, Inc.",80021,AVRC,Aviation Capital Program.commercial paper,...,Capital Outlay,4599,Other Improvements-other than Building,92500.0,"ENGINEERING SERVICES, PROFESSIONAL",,,9,2023-09-01,2023Q4
16290,2023-09-19T00:00:00.000,2023,12,EFT-BMS-EY230054757,2717642.92,VS96112,"Flatiron Constructors, Inc.",80021,AVIC,Aviation Capital Program,...,Capital Outlay,4599,Other Improvements-other than Building,92500.0,"ENGINEERING SERVICES, PROFESSIONAL",,,9,2023-09-01,2023Q4
23262,2023-08-24T00:00:00.000,2023,11,EFT-BMS-EY230050549,4201637.47,VS96112,"Flatiron Constructors, Inc.",80021,AVIC,Aviation Capital Program,...,Capital Outlay,4599,Other Improvements-other than Building,92500.0,"ENGINEERING SERVICES, PROFESSIONAL",,,8,2023-08-01,2023Q4
23940,2023-08-22T00:00:00.000,2023,11,EFT-BMS-EY230050091,163920.47,VS96112,"Flatiron Constructors, Inc.",80021,AVRC,Aviation Capital Program.commercial paper,...,Capital Outlay,4599,Other Improvements-other than Building,92500.0,"ENGINEERING SERVICES, PROFESSIONAL",,,8,2023-08-01,2023Q4
30682,2023-07-21T00:00:00.000,2023,10,EFT-BMS-EY230045886,1725487.63,VS96112,"Flatiron Constructors, Inc.",80021,AVIC,Aviation Capital Program,...,Capital Outlay,4599,Other Improvements-other than Building,92500.0,"ENGINEERING SERVICES, PROFESSIONAL",,,7,2023-07-01,2023Q4
35832,2023-06-27T00:00:00.000,2023,9,EFT-BMS-EY230042747,2632196.39,VS96112,"Flatiron Constructors, Inc.",80021,AVIC,Aviation Capital Program,...,Capital Outlay,4599,Other Improvements-other than Building,92500.0,"ENGINEERING SERVICES, PROFESSIONAL",,,6,2023-06-01,2023Q3
40297,2023-06-07T00:00:00.000,2023,9,EFT-BMS-EY230040238,35322.07,VS96112,"Flatiron Constructors, Inc.",80021,AVRC,Aviation Capital Program.commercial paper,...,Capital Outlay,4599,Other Improvements-other than Building,92500.0,"ENGINEERING SERVICES, PROFESSIONAL",,,6,2023-06-01,2023Q3
41383,2023-06-01T00:00:00.000,2023,9,EFT-BMS-EY230039470,3816070.59,VS96112,"Flatiron Constructors, Inc.",80021,AVIC,Aviation Capital Program,...,Capital Outlay,4599,Other Improvements-other than Building,92500.0,"ENGINEERING SERVICES, PROFESSIONAL",,,6,2023-06-01,2023Q3
48129,2023-05-01T00:00:00.000,2023,8,EFT-BMS-EY230035256,2082237.69,VS96112,"Flatiron Constructors, Inc.",80021,AVIC,Aviation Capital Program,...,Capital Outlay,4599,Other Improvements-other than Building,92500.0,"ENGINEERING SERVICES, PROFESSIONAL",,,5,2023-05-01,2023Q3


## Practice ##

Using the previous exercise, perform the following:
- Create a dataframe for the purchases by the Fire Department
- Find out who the top vendors are
- Find out how much the most recent payout was for

In [37]:
#Create a dataframe for the purchases by the Fire Department
fire_dept =  df[df.department =='Dallas Fire Department']
fire_dept.style.format(format_dict)
fire_dept.head()

Unnamed: 0,rundate,fy,fm,docid,chksubtot,vcode,vendor,zip5,ftyp,fundtype,...,objectgroup,obj,object,comm,commoditydscr,invoicedate,invoicenumber,cm,date_combo,fy_q
44,2023-11-01T00:00:00.000,2024,2,EFT-BMS-EY240009871,224.0,359345,RECOVERY SYSTEMS INC,75067.0,GNFD,General Fund,...,Contractual & Other Services,3110,Repair&Maint Serv Equip,7095.0,WRECKERS,,,11,2024-11-01,2025Q1
45,2023-11-01T00:00:00.000,2024,2,EFT-BMS-EY240010229,1355.0,507070,SUNBELT RENTALS INC.,30384.0,GNFD,General Fund,...,Contractual & Other Services,3060,Equip Rntl (Outside City),98100.0,RENTAL OR LEASE OF EQUIPMENT - GENERAL EQUIPMENT,,,11,2024-11-01,2025Q1
118,2023-11-01T00:00:00.000,2024,2,EFT-BMS-EY240010230,442.64,VC21411,Hydraulic Hose of Love Field LLC,75027.0,GNFD,General Fund,...,Contractual & Other Services,3110,Repair&Maint Serv Equip,6060.0,"HOSE AND HOSE FITTINGS: BRAKE, HEATER, RADIATO...",,,11,2024-11-01,2025Q1
232,2023-11-01T00:00:00.000,2024,2,EFT-BMS-EY240010185,2856.7,VS90252,"Siddons Martin Emergency Group, LLC",77032.0,GNFD,General Fund,...,Contractual & Other Services,3110,Repair&Maint Serv Equip,7003.0,AMBULANCES AND RESCUE VEHICLES,,,11,2024-11-01,2025Q1
265,2023-11-01T00:00:00.000,2024,2,EFT-BMS-EY240009926,1272.3,VC14210,"Safeware, Inc",20706.0,GNFD,General Fund,...,Contractual & Other Services,3110,Repair&Maint Serv Equip,93608.0,"AIR COMPRESSORS AND ACCESSORIES, MAINTENANCE A...",,,11,2024-11-01,2025Q1


In [38]:
fire_dept.shape

(188, 25)

In [114]:
#Find out who the top vendors are

fire_vendors =fire_dept.groupby(['vendor'])['chksubtot'].agg(['sum']).sort_values(by=['sum'],ascending=False)
fire_vendors


Unnamed: 0_level_0,sum
vendor,Unnamed: 1_level_1
"Siddons Martin Emergency Group, LLC",2554550.90
Digitech Computer LLC,547732.61
Public Consulting Group LLC,459278.10
SAM PACK'S FIVE STAR FORD,166811.50
TEXAS COMMISSION ON FIRE,112740.00
...,...
JACQUELINE CAMPER,196.84
"Charter Communications Holdings, LLC",159.52
MCSHAN FLORIST,96.35
FEDERAL EXPRESS CORP,42.58


In [116]:
#How much was the most recent payout for? (list the top vendor's payments)
fire_dept.loc[fire_dept['vendor'].str.contains('Siddons', case=False)]

Unnamed: 0,rundate,fy,fm,docid,chksubtot,vcode,vendor,zip5,ftyp,fundtype,...,objectgroup,obj,object,comm,commoditydscr,invoicedate,invoicenumber,cm,date_combo,fy_q
232,2023-11-01T00:00:00.000,2024,2,EFT-BMS-EY240010185,2856.7,VS90252,"Siddons Martin Emergency Group, LLC",77032.0,GNFD,General Fund,...,Contractual & Other Services,3110,Repair&Maint Serv Equip,7003.0,AMBULANCES AND RESCUE VEHICLES,,,11,2024-11-01,2025Q1
288,2023-11-01T00:00:00.000,2024,2,EFT-BMS-EY240010183,2665.0,VS90252,"Siddons Martin Emergency Group, LLC",77032.0,GNFD,General Fund,...,Contractual & Other Services,3085,Freight,6000.0,AUTOMOTIVE AND TRAILER MAINTENANCE ITEMS AND R...,,,11,2024-11-01,2025Q1
534,2023-11-01T00:00:00.000,2024,2,EFT-BMS-EY240010181,700.0,VS90252,"Siddons Martin Emergency Group, LLC",77032.0,GNFD,General Fund,...,Contractual & Other Services,3085,Freight,6000.0,AUTOMOTIVE AND TRAILER MAINTENANCE ITEMS AND R...,,,11,2024-11-01,2025Q1
667,2023-10-31T00:00:00.000,2024,1,EFT-BMS-EY240009690,75.0,VS90252,"Siddons Martin Emergency Group, LLC",77032.0,GNFD,General Fund,...,Contractual & Other Services,3085,Freight,6000.0,AUTOMOTIVE AND TRAILER MAINTENANCE ITEMS AND R...,,,10,2024-10-01,2025Q1
778,2023-10-31T00:00:00.000,2024,1,EFT-BMS-EY240009688,7557.69,VS90252,"Siddons Martin Emergency Group, LLC",77032.0,GNFD,General Fund,...,Contractual & Other Services,3110,Repair&Maint Serv Equip,7003.0,AMBULANCES AND RESCUE VEHICLES,,,10,2024-10-01,2025Q1
844,2023-10-31T00:00:00.000,2024,1,EFT-BMS-EY240009689,1431.99,VS90252,"Siddons Martin Emergency Group, LLC",77032.0,GNFD,General Fund,...,Contractual & Other Services,3110,Repair&Maint Serv Equip,7003.0,AMBULANCES AND RESCUE VEHICLES,,,10,2024-10-01,2025Q1
905,2023-10-31T00:00:00.000,2024,1,EFT-BMS-EY240009687,831610.0,VS90252,"Siddons Martin Emergency Group, LLC",77032.0,OTDB,Other GO CIP - Debt,...,Capital Outlay,4742,Trucks,7003.0,AMBULANCES AND RESCUE VEHICLES,,,10,2024-10-01,2025Q1
968,2023-10-31T00:00:00.000,2024,1,EFT-BMS-EY240009686,2235.3,VS90252,"Siddons Martin Emergency Group, LLC",77032.0,GNFD,General Fund,...,Contractual & Other Services,3110,Repair&Maint Serv Equip,7003.0,AMBULANCES AND RESCUE VEHICLES,,,10,2024-10-01,2025Q1
1319,2023-10-30T00:00:00.000,2024,1,EFT-BMS-EY240009150,250.0,VS90252,"Siddons Martin Emergency Group, LLC",77032.0,GNFD,General Fund,...,Contractual & Other Services,3085,Freight,6000.0,AUTOMOTIVE AND TRAILER MAINTENANCE ITEMS AND R...,,,10,2024-10-01,2025Q1
1364,2023-10-30T00:00:00.000,2024,1,EFT-BMS-EY240009052,1149.0,VS90252,"Siddons Martin Emergency Group, LLC",77032.0,GNFD,General Fund,...,Contractual & Other Services,3110,Repair&Maint Serv Equip,7003.0,AMBULANCES AND RESCUE VEHICLES,,,10,2024-10-01,2025Q1
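
The filter above lists the vendor's payments but leaves the reader to pick out the latest one by eye. A minimal sketch of isolating the most recent run date and its payments follows. The rows and vendor names are made up, and several payments can share the latest date, so the example reports both the individual rows and their sum.

```python
import pandas as pd

# Hypothetical payment rows for illustration.
toy = pd.DataFrame(
    {
        "rundate": [
            "2023-10-30T00:00:00.000",
            "2023-11-01T00:00:00.000",
            "2023-11-01T00:00:00.000",
            "2023-11-01T00:00:00.000",
        ],
        "vendor": ["Example Vendor LLC", "Example Vendor LLC", "Example Vendor LLC", "Other Co"],
        "chksubtot": [250.0, 700.0, 2665.0, 10.0],
    }
)

# Compare real dates rather than strings.
toy["rundate"] = pd.to_datetime(toy["rundate"])

vendor_rows = toy[toy["vendor"].str.contains("example vendor", case=False)]
latest_date = vendor_rows["rundate"].max()
latest_rows = vendor_rows[vendor_rows["rundate"] == latest_date]
latest_total = latest_rows["chksubtot"].sum()

print(latest_rows)
print(latest_date.date(), latest_total)
```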


#### You receive a request from a resident, who asks the following via email: ####
    "I am requesting an opportunity to inspect or obtain copies of public records for the contract - Temporary Staffing. The details I am requesting are given below: Proposals of the awarded vendors. Spending on this contract till now."
    - Citizen Candy

In [117]:
temp_staff_contract = df[df.object =='Outside Temps/Staffing']
temp_staff_contract.head()

Unnamed: 0,rundate,fy,fm,docid,chksubtot,vcode,vendor,zip5,ftyp,fundtype,...,objectgroup,obj,object,comm,commoditydscr,invoicedate,invoicenumber,cm,date_combo,fy_q
32,2023-11-01T00:00:00.000,2024,2,EFT-BMS-EY240010128,1105.2,VS0000066539,"SMITH TEMPORARIES, INC",75201,GNFD,General Fund,...,Contractual & Other Services,3994,Outside Temps/Staffing,96269.0,"PERSONNEL SERVICES, TEMPORARY",,,11,2024-11-01,2025Q1
40,2023-11-01T00:00:00.000,2024,2,EFT-BMS-EY240010133,1105.2,VS0000066539,"SMITH TEMPORARIES, INC",75201,CVID,COVID-19,...,Contractual & Other Services,3994,Outside Temps/Staffing,96269.0,"PERSONNEL SERVICES, TEMPORARY",,,11,2024-11-01,2025Q1
42,2023-11-01T00:00:00.000,2024,2,EFT-BMS-EY240010145,320.68,VS0000066539,"SMITH TEMPORARIES, INC",75201,GNFD,General Fund,...,Contractual & Other Services,3994,Outside Temps/Staffing,96269.0,"PERSONNEL SERVICES, TEMPORARY",,,11,2024-11-01,2025Q1
47,2023-11-01T00:00:00.000,2024,2,EFT-BMS-EY240010123,1464.48,VS0000066539,"SMITH TEMPORARIES, INC",75201,GNFD,General Fund,...,Contractual & Other Services,3994,Outside Temps/Staffing,96269.0,"PERSONNEL SERVICES, TEMPORARY",,,11,2024-11-01,2025Q1
61,2023-11-01T00:00:00.000,2024,2,EFT-BMS-EY240010154,884.16,VS0000066539,"SMITH TEMPORARIES, INC",75201,GNFD,General Fund,...,Contractual & Other Services,3994,Outside Temps/Staffing,96269.0,"PERSONNEL SERVICES, TEMPORARY",,,11,2024-11-01,2025Q1


In [42]:
temp_staff_contract.shape

(758, 25)

In [118]:
#temp staffing contract totals by vendor and fiscal year
temp_staff_contract.groupby(['vendor','fy'])['chksubtot'].agg(['sum']).sort_values(by=['vendor','sum'],ascending=False)

Unnamed: 0_level_0,Unnamed: 1_level_0,sum
vendor,fy,Unnamed: 2_level_1
"SMITH TEMPORARIES, INC",2023,3245498.77
"SMITH TEMPORARIES, INC",2024,1103467.57
Rushmore Corporation,2023,520631.52
Rushmore Corporation,2024,98594.89
Lincoln Leadership Advisors LLC,2023,0.0
"GTS Technology Solutions, Inc",2023,551127.34
"GTS Technology Solutions, Inc",2024,262376.8
"BGSF Professional, LLC",2023,115487.5
"BGSF Professional, LLC",2024,46472.5
"22ND CENTURY TECHNOLOGIES, INC.",2023,2286419.89

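For a records request that asks for "spending on this contract till now", a per-vendor total across fiscal years is often easier to read than the stacked vendor/year listing. The sketch below, on made-up rows, pivots fiscal years into columns and adds a total. The column name `total` is an illustrative choice.

```python
import pandas as pd

# Hypothetical temp-staffing payments.
toy = pd.DataFrame(
    {
        "vendor": ["Vendor A", "Vendor A", "Vendor A", "Vendor B"],
        "fy": [2023, 2023, 2024, 2023],
        "chksubtot": [100.0, 50.0, 25.0, 10.0],
    }
)

# Sum per vendor and fiscal year, then spread the years across columns.
by_year = toy.groupby(["vendor", "fy"])["chksubtot"].sum().unstack(fill_value=0.0)

# Add the contract-to-date total and rank vendors by it.
by_year["total"] = by_year.sum(axis=1)
by_year = by_year.sort_values("total", ascending=False)

print(by_year)
```

The grand total for the whole contract is then `by_year["total"].sum()`.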

### Modeling (optional)

In [None]:
df.columns

In [None]:
#visualizing mean payout ($) by fiscal quarter
df.groupby('fy_q')['chksubtot'].mean().plot(kind='line',color= 'C1')
plt.title("Mean Vendor Payout ($) by Fiscal Quarter", color= 'C1')
plt.xlabel("Fiscal quarter",color= 'C1')
plt.ylabel("Mean check paid ($)", color= 'C1')
plt.show()
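
The `fy_q` column used in the plot is not created in any cell shown here. In the sample outputs, rows with a November 2023 run date and FY 2024 carry `2025Q1`, which suggests the year may have been shifted twice, though that cannot be confirmed without the missing cell. A minimal sketch of one way to derive a fiscal quarter directly from the run date follows, assuming a fiscal year that ends in September (pandas' `Q-SEP` quarterly frequency).

```python
import pandas as pd

# Run dates in the same ISO format as the rundate column.
run_dates = pd.to_datetime(
    pd.Series(
        [
            "2023-05-01T00:00:00.000",
            "2023-09-19T00:00:00.000",
            "2023-11-01T00:00:00.000",
        ]
    )
)

# 'Q-SEP' means quarters of a year that ends in September, so
# October-December is Q1 of the following fiscal year.
fy_q = run_dates.dt.to_period("Q-SEP").astype(str)

print(fy_q.tolist())
```

Because the label is computed from the date alone, it stays consistent with the `fy` and `fm` columns without combining them by hand.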

In [None]:
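#correlations are only defined for numeric columns; recent pandas versions raise an error without numeric_only=True
fire_dept.corr(numeric_only=True)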

### Evaluation (Data Presentation/Validation: how to present data to stakeholders)

In [126]:
#Create Data Profile on Fire Department vendors/contracts only
format_dict = {'%Missing': '{:.2%}'}

#fire_dept cols to list
col_nam = fire_dept.columns.tolist()
#create an empty profile table with one row per fire_dept column
df1_clean = pd.DataFrame(index = col_nam,columns=['Column Description', 'Populated', 'Unique', 'Missing','%Missing','ActualType', 'Minimum', 'Median', 'Maximum', 'DataType'])

#infer the best-fitting dtype for each column
df1d = fire_dept.convert_dtypes()

#fill new columns with data
df1_clean['Populated'] = fire_dept.count()
df1_clean['Unique'] = fire_dept.nunique()
df1_clean['Missing'] = fire_dept.isnull().sum()
#share of all rows that are missing (dividing by 'Populated' fails for fully empty columns)
df1_clean['%Missing'] = df1_clean['Missing']/len(fire_dept)
df1_clean['ActualType'] = df1d.dtypes
df1_clean['Minimum'] = fire_dept.min()
df1_clean['Maximum'] = fire_dept.max()
#median is only defined for numeric columns
df1_clean['Median'] = fire_dept.median(numeric_only=True)

df1_clean.style.format(format_dict)

Unnamed: 0,Column Description,Populated,Unique,Missing,%Missing,ActualType,Minimum,Median,Maximum,DataType
rundate,,188,17,0,0.00%,string,2023-10-10T00:00:00.000,,2023-11-01T00:00:00.000,
fy,,188,1,0,0.00%,Int64,2024,2024.0,2024,
fm,,188,2,0,0.00%,Int64,1,1.0,2,
docid,,188,184,0,0.00%,string,AD-BMS-AY240000593,,EFT-BMS-EY240010230,
chksubtot,,188,146,0,0.00%,float64,3.460000,968.625,831610.000000,
vcode,,188,72,0,0.00%,string,032038,,VS97753,
vendor,,188,72,0,0.00%,string,"ABventure DeSign, LLC",,WILLIAMS SCOTSMAN INC,
zip5,,188,66,0,0.00%,Int64,0.000000,75093.5,92562.000000,
ftyp,,188,2,0,0.00%,string,GNFD,,OTDB,
fundtype,,188,2,0,0.00%,string,General Fund,,Other GO CIP - Debt,


In [130]:
# export the Fire Department data to Excel
file_name = "fire_dept_Data_Profile.xlsx"
with pd.ExcelWriter(file_name) as writer:
    # write the fire_dept rows to the 'Fire_Profile' sheet
    fire_dept.to_excel(writer, sheet_name='Fire_Profile', index=False)



### Deployment (Visualization & Presentation)