# City of Seattle Staff Demographic Analysis
#### Navya Mangipudi, Big Data & Analytics, 9-17-19

## Abstract:
This lab analyzes the wages and ages of City of Seattle employees. In this report, we find the median, mode, mean, minimum, and maximum values for both wage and age in the dataset. We also find the sum of all hourly wages in the dataset.



## Dataset Preparation

This dataset represents the demographics of Seattle City staff. It includes race, sex, department, age, wage, whether the employee is temporary or regular, and the status of the employee. For this particular analysis, I only require the age and wage data.

* First, I put all of the lines in the data file into a list. 
* Then, I deleted the descriptors above the data from the dataset. 
* Then, I made two empty lists; one for wage and one for age. 
* I then split each line at its commas, which turned it into a list of fields.
* Lastly, I put the wage and age data into their respective lists, converting age to integers and wage to floating-point numbers. Left as strings, the values could not be used in the necessary calculations.

This dataset is accessible at: https://data.seattle.gov/City-Business/City-of-Seattle-Staff-Demographics/5avq-r9hj

This data is collected by the Seattle Department of Human Resources and, since the data collection is funded by tax dollars, the information is available to the public. The data is updated monthly, which keeps the information current, and it has 7 columns and over 14.4k rows.

In [6]:
import statistics

# Read every line of the CSV file into a list; the file closes automatically.
with open("City_of_Seattle_Staff_Demographics.csv", "r") as staff_file:
    staff_data = list(staff_file)

# Drop the header row so that only data rows remain.
del staff_data[0]

age = []
wage = []

# Field 3 of each row is the age and field 4 is the hourly wage.
for row in staff_data:
    file_row = row.split(",")
    age.append(int(file_row[3]))
    wage.append(float(file_row[4]))
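
Splitting each line at every comma works as long as no field contains a comma of its own. As a hedged alternative, the sketch below reads the same kind of data with Python's `csv` module, which respects quoted fields, and looks values up by column name instead of position. The inline sample rows and the column names `Department`, `Age` and `Hourly Rate` are assumptions made for illustration; they are not confirmed headers from the real file.

```python
import csv
import io

# Hypothetical sample standing in for the real file. The column names
# "Department", "Age" and "Hourly Rate" are assumptions for illustration.
sample_csv = io.StringIO(
    "Department,Age,Hourly Rate\n"
    '"Dept A, North Office",34,28.50\n'
    "Dept B,51,47.25\n"
)

sample_age = []
sample_wage = []

for record in csv.DictReader(sample_csv):
    # The csv module keeps the quoted "Dept A, North Office" as one field,
    # so the comma inside it does not shift the Age and Hourly Rate columns.
    sample_age.append(int(record["Age"]))
    sample_wage.append(float(record["Hourly Rate"]))

print(sample_age)
print(sample_wage)
```

With the real file, the `io.StringIO` object would be replaced by the opened CSV file, and the column names would need to match the actual header row.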

## Data Modeling

* Here, I calculated the minimum, maximum, mode, median, and mean for both age and wage. I used the built-in min, max and sum functions and, with the imported statistics module, I was also able to calculate the mode, mean and median.
* In addition, I calculated the sum of all Seattle City staff hourly wages in order to determine how much it costs the city per hour to pay its employees.

In [7]:
minage = min(age) 
minwage = min(wage)
modewage = statistics.mode(wage)
modeage = statistics.mode(age)
maxage = max(age)
maxwage = max(wage)
meanage = statistics.mean(age)
meanwage = statistics.mean(wage)
medage = statistics.median(age)
medwage = statistics.median(wage)
totalhrlywage = sum(wage)
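
One caveat about the mode: `statistics.mode` returns a single value even when several values are tied for most common (on Python 3.8 and later it returns the first one encountered, while older versions raise `StatisticsError`). The sketch below uses a small made-up list, not the real wage data, to show how `statistics.multimode` reports every tied value.

```python
import statistics

# Hypothetical sample in which two wages are tied for most common.
sample_wage = [16.11, 16.11, 39.02, 39.02, 52.40]

# multimode (Python 3.8+) returns all of the most common values,
# in the order in which they first appear in the data.
modes = statistics.multimode(sample_wage)
print(modes)
```

If the list that comes back has more than one entry, the single value reported by `statistics.mode` is only one of several equally common values.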


### Wage Data Calculations:

In [8]:
print("Wage statistics:")
print("Minimum wage:", minwage)
print("Mode wage:", modewage)
print("Maximum wage:", maxwage)
print("Mean wage:", meanwage)
print("Median wage:", medwage)
print("Sum of all hourly wages:", totalhrlywage)

Wage statistics:
Minimum wage: 5.53
Mode wage: 16.11
Maximum wage: 162.8353
Mean wage: 40.03381942370015
Median wage: 39.02
Sum of all hourly wages: 568240.0329000067


### Age Data Calculations

In [9]:
print("Age Statistics:")
print("Minimum age:", minage)
print("Mode age:", modeage)
print("Maximum age:", maxage)
print("Mean age:", meanage)
print("Median age:", medage)

Age Statistics:
Minimum age: 14
Mode age: 48
Maximum age: 92
Mean age: 44.53395801042694
Median age: 45.0


## Data Analysis & Conclusion

### Wage Data:

* When analyzing the wage data, we can conclude that the minimum wage for the staff of the City of Seattle is 5.53 dollars/hr. This is far less than I initially expected, mainly because the overall Seattle minimum wage is 12.00 dollars/hr. This is an area for further exploration in order to find the age, gender and department of this individual.
* We can also conclude that the mode wage is 16.11 dollars/hr, which means that this is the most common wage among the Seattle City staff.
* The maximum wage for the staff of the City of Seattle is 162.84 dollars/hr, a number that makes me wonder what department and job this individual works in, as well as their gender and age.
* We can also conclude that the mean wage of a Seattle City employee is about 40.03 dollars/hr, a relatively high number in comparison to both the minimum staff wage (5.53 dollars/hr) and the Seattle minimum wage (12.00 dollars/hr).
* The median wage is 39.02 dollars/hr, a number that is extremely close to the mean wage. This makes me wonder whether many or most of the Seattle City staff wages fall in the range of ~35-40 dollars/hr, which would explain why the median and the mean are so similar.
* We can also conclude that the hourly cost for the City of Seattle to pay its staff is around 568,240.03 dollars, assuming every employee is working during that hour. This makes me wonder how much the City of Seattle spends in a whole day, assuming that the average work day lasts ~8 hours.


In [10]:
totaldaywage = totalhrlywage * 8
print("Assuming the average work day is 8 hours, the City of Seattle would spend on average", totaldaywage, "dollars daily on worker pay.")

Assuming the average work day is 8 hours, the City of Seattle would spend on average 4545920.263200054 dollars daily on worker pay.
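
The wage bullets above ask which department and age belong to the lowest-paid and highest-paid individuals. One possible way to answer that, sketched here under assumptions, is to keep each employee's fields together in one record and pass a `key` function to `min` and `max`. The records, department names and values below are entirely hypothetical and do not come from the dataset.

```python
# Hypothetical records; with the real data these would be built from the CSV rows.
sample_staff = [
    {"department": "Dept A", "age": 16, "wage": 10.0},
    {"department": "Dept B", "age": 48, "wage": 40.0},
    {"department": "Dept C", "age": 57, "wage": 90.0},
]

# key= tells min/max to compare records by wage while returning the whole record.
lowest = min(sample_staff, key=lambda record: record["wage"])
highest = max(sample_staff, key=lambda record: record["wage"])

print("Lowest paid:", lowest["department"], lowest["age"])
print("Highest paid:", highest["department"], highest["age"])
```

Keeping the fields together this way avoids the separate `age` and `wage` lists losing track of which values belong to the same person.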


### Age Data:

* When analyzing the age data, we can conclude that the minimum age for the staff of the City of Seattle is 14 years. This makes me incredibly curious, and the nature of this 14-year-old's work could be an area for future exploration.
* We can also conclude that the mode age is 48, which means that this is the most common age among the Seattle City staff.
* The maximum age for the staff of the City of Seattle is 92, an age far past retirement age. This makes me wonder how much this individual is being paid and what type of work they are doing.
* We can also conclude that the mean age of a Seattle City employee is about 44.5, a number close to the mode of the dataset (48). This makes me wonder whether the majority of the City of Seattle staff is in their 40s.
* The median age is 45, a number that, again, is close to the mode and the mean, furthering the idea that a large part of the Seattle City workforce is in the range of 40-50 years old.
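
The mean, median and mode being close together suggests, but does not prove, that ages are concentrated in the 40s. A rough way to check is to count employees per decade, as sketched below with `collections.Counter`. The ages in `sample_age` are made up for illustration; with the real data, the `age` list built during dataset preparation would take their place.

```python
from collections import Counter

# Hypothetical ages, not taken from the dataset.
sample_age = [19, 27, 34, 41, 44, 45, 48, 48, 52, 63]

# Integer division by 10 groups ages by decade: 41 -> 40, 48 -> 40, 52 -> 50.
by_decade = Counter((a // 10) * 10 for a in sample_age)
share_40s = by_decade[40] / len(sample_age)

for decade in sorted(by_decade):
    print(f"{decade}s: {by_decade[decade]}")
print("Share of staff in their 40s:", share_40s)
```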

### Questions: 

* Because the minimum age is 14, I wonder what position this individual holds and how many hours per week they work. This, along with wage and position data for other youth under 18, is an area for future exploration within this dataset.
* The calculation for the sum of all wages makes me wonder whether the cost of wages for the City of Seattle has fluctuated, and whether any change has been accompanied by more or fewer jobs or by higher or lower wages.
* Another area for further exploration is finding the percentage of male vs. female vs. other workers within the Seattle City staff and the departments in which the different genders are concentrated.
* I would also like to find race statistics and figure out how diverse the Seattle City workforce is and whether or not it reflects the demographics and diversity of Seattle as a whole.
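
As a starting point for these questions, the sketch below shows one way the counts could be produced once each row is stored as a record. The field names `department`, `sex` and `age`, and all of the values, are hypothetical placeholders rather than real data.

```python
from collections import Counter

# Hypothetical records standing in for rows of the dataset.
sample_records = [
    {"department": "Dept A", "sex": "F", "age": 16},
    {"department": "Dept A", "sex": "M", "age": 35},
    {"department": "Dept B", "sex": "F", "age": 52},
    {"department": "Dept B", "sex": "F", "age": 17},
]

# Share of each sex across all staff.
sex_counts = Counter(r["sex"] for r in sample_records)
sex_share = {sex: n / len(sample_records) for sex, n in sex_counts.items()}

# Head count for each (department, sex) pair.
dept_sex_counts = Counter((r["department"], r["sex"]) for r in sample_records)

# Records for staff under 18.
minors = [r for r in sample_records if r["age"] < 18]

print(sex_share)
print(dept_sex_counts)
print(len(minors), "staff under 18")
```

The same `Counter` pattern would apply to race by swapping in the corresponding field.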

## Acknowledgements

* I used code from Ms. Sconyers' initial Jupyter Notebooks file in order to extract the age and wage data and put it into the two separate lists. I also asked Ms. Sconyers for advice on my dataset preparation and modeling sections.
* I also acknowledge the Seattle Department of Human Resources, since without their data I could not have made these calculations or modelled this data.

