This repository has been archived by the owner on Dec 27, 2020. It is now read-only.

Analyze wage gap for staff in the City of San Jose #23

Open
EngineerEmily opened this issue Sep 19, 2015 · 10 comments

Comments

@EngineerEmily
Collaborator

Data on city staff salaries is usually public. Can we take that data and see if there's a difference between the wages of men and women?

@evankroske changed the title from "Wage gap for staff in the city of San Jose" to "Analyze wage gap for staff in the City of San Jose" Sep 20, 2015
@3vivekb
Collaborator

3vivekb commented Sep 30, 2015

While women's equality is an important issue, I don't think it is something Code for San Jose should attempt to study. Yes, wage data is public for all city employees. I doubt it will tell us whether employees are male or female. But the results of our analysis shouldn't be publicized, since at worst it could jeopardize our relationship with the City of San Jose.

@3vivekb closed this as completed Sep 30, 2015
@EngineerEmily
Collaborator Author

I don't think that's a good enough reason to remove the option of someone working on it. If no one wants to work on it, fine. But being afraid that the city would have issues with an analysis of openly accessible data hinders our credibility as a transparent and open organization.

@EngineerEmily reopened this Sep 30, 2015
@dnahol

dnahol commented Feb 24, 2016

Hi. I'd be interested in helping. I have a sociology background and I've done sociological data analysis.

@dnahol

dnahol commented Feb 24, 2016

So the main consideration in data analysis of wage gaps is to account for confounding variables (occupation, education, experience). I would use the Blinder-Oaxaca decomposition.
https://cran.r-project.org/web/packages/oaxaca/vignettes/oaxaca.pdf
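To make the idea concrete, here is a minimal sketch of a two-fold Blinder-Oaxaca decomposition in Python with NumPy. It is not the R `oaxaca` package linked above. It assumes group A's coefficients as the reference, uses a single covariate (years of experience), and runs on synthetic, made-up numbers rather than any San Jose data. The function names `ols` and `twofold_oaxaca` are hypothetical.

```python
import numpy as np


def ols(X, y):
    """Least-squares coefficients for y ~ X (X already includes an intercept column)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def twofold_oaxaca(X_a, y_a, X_b, y_b):
    """Two-fold decomposition with group A's coefficients as the reference.

    Returns (gap, explained, unexplained), where gap == explained + unexplained.
    """
    # Prepend an intercept column to each design matrix.
    Xa = np.column_stack([np.ones(len(X_a)), X_a])
    Xb = np.column_stack([np.ones(len(X_b)), X_b])

    beta_a = ols(Xa, y_a)
    beta_b = ols(Xb, y_b)

    mean_a = Xa.mean(axis=0)
    mean_b = Xb.mean(axis=0)

    gap = y_a.mean() - y_b.mean()
    # Part of the gap due to different average characteristics.
    explained = (mean_a - mean_b) @ beta_a
    # Part of the gap due to different returns to the same characteristics.
    unexplained = mean_b @ (beta_a - beta_b)
    return gap, explained, unexplained


# Synthetic data for illustration only (assumed values, not real salaries).
rng = np.random.default_rng(0)
n = 500
exp_a = rng.uniform(0, 30, n)  # years of experience, group A
exp_b = rng.uniform(0, 20, n)  # years of experience, group B
log_wage_a = 10.0 + 0.03 * exp_a + rng.normal(0, 0.1, n)
log_wage_b = 9.9 + 0.03 * exp_b + rng.normal(0, 0.1, n)

gap, explained, unexplained = twofold_oaxaca(
    exp_a.reshape(-1, 1), log_wage_a, exp_b.reshape(-1, 1), log_wage_b
)
print(f"gap={gap:.3f} explained={explained:.3f} unexplained={unexplained:.3f}")
```

The "explained" part is what the difference in average experience accounts for; the "unexplained" part is what remains after controlling for it. A real analysis would need the confounders listed above (occupation, education, experience) as columns, which the public salary data may not provide.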

About the political concerns, I think this project would be in line with the local government's stated goals according to this article: http://www.mercurynews.com/census/ci_28718950/santa-clara-county-scrutinizing-wage-disparities?source=infinite-up

"SAN JOSE -- Santa Clara County leaders are rolling out a plan to tackle a problem that persists in Silicon Valley and across the nation despite federal legislation dating back to the 1960s -- an income disparity between male and female workers, and those of different ethnicities.

Supervisors Dave Cortese and Cindy Chavez are proposing a "Gender and Ethnicity Pay Equity Ordinance" that would scrutinize the county's own payroll, as well as that of companies doing business with the county to make sure compensation is equal for similar work.

"We're saying that we're going to clean our own house, and then we're going to come talk to you about your house," Chavez said."

@evankroske
Collaborator

@ying1

ying1 commented Feb 24, 2016

I looked at the data and have lots of questions. That said, it seems that
any sort of processing to determine gender / education / etc.
requires another data set. In addition, there is no hours-worked data
(so it is not clear whether someone is part time...)

On Wed, Feb 24, 2016 at 7:42 AM, Evan Kroske <notifications@github.com> wrote:

Here's the data for 2015:

http://data.sanjoseca.gov/dataviews/225564/employee-compensation-2015/



@evankroske
Collaborator

@ying1, which data set are you talking about, the San Jose data set or the CA data set? If you need more data from San Jose, we can get it for you:

http://codeforsanjose.com/data/

@mthong, one of our co-captains, works at the city.

@dnahol

dnahol commented Feb 24, 2016

@ying1, @evankroske, as far as I can tell, neither the San Jose nor the CA data set for local gov employee compensation has demographic data.

Edit: The U.S. Census Bureau data has been used before to report on general population income data by demographics in Santa Clara County. I haven't looked at it closely enough yet to see if I can zoom in on San Jose. If we do find a comparable data set for just city or even county employees, that would be an interesting comparison. I found a Seattle data set like that.

I did post a data request for it. I'd be super excited if something turned up. Thanks @evankroske

@mthong

mthong commented Feb 24, 2016

@ying1, @dnahol - You're correct that there is no demographic data in the public salary dataset, and there is no public dataset that provides the associated demographic information for each individual employee -- many people would consider their age and education level "private" information that does not need to be shared with the public. The purpose of the dataset is to share the salaries paid by public taxpayer money.

Perhaps you can get at the gender by taking a guess based on the names. There seem to be existing libraries available that help you do this (e.g. https://genderize.io/). Something like that seems to me to be your best bet.
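A rough sketch of that name-based guess in Python follows. It assumes genderize.io exposes a JSON endpoint at `https://api.genderize.io/?name=...` that returns `gender` and `probability` fields; check the service's own documentation before relying on that. The function names, the 0.9 threshold, and the canned responses in `fake_fetch` are all hypothetical and only there so the example runs offline.

```python
import json
import urllib.parse
import urllib.request


def fetch_genderize(first_name):
    """Query genderize.io for one first name (assumed endpoint and response shape)."""
    url = "https://api.genderize.io/?" + urllib.parse.urlencode({"name": first_name})
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def guess_gender(first_name, fetch=fetch_genderize, min_probability=0.9):
    """Return the guessed gender string, or None when missing or too uncertain."""
    result = fetch(first_name)
    gender = result.get("gender")
    probability = result.get("probability") or 0.0
    if gender is None or probability < min_probability:
        return None
    return gender


def fake_fetch(name):
    """Offline stand-in for the API with made-up values, for illustration only."""
    canned = {
        "emily": {"name": "emily", "gender": "female", "probability": 0.98, "count": 1000},
    }
    unknown = {"name": name, "gender": None, "probability": 0.0, "count": 0}
    return canned.get(name.lower(), unknown)


print(guess_gender("Emily", fetch=fake_fetch))
print(guess_gender("Xyzzy", fetch=fake_fetch))
```

Dropping low-probability guesses keeps ambiguous names out of the comparison, at the cost of a smaller sample. Any result built this way is an estimate from first names, not recorded demographic data.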

As @evankroske suggests, you may be able to find some reports with aggregated demographic data on City of San Jose employees, but I don't think that would help you do the kind of analysis you want to do.

@dnahol

dnahol commented Feb 25, 2016

Thank you @mthong! I'll try that tool.
