
## UK Gender Pay Gap


### Description:
Employers with 250 or more employees in the UK must publish and report specific figures about their gender pay gap. The gender pay gap is the difference between the average earnings of men and women, expressed as a percentage of men’s earnings. For example, ‘women earn 15% less than men per hour’.
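
The arithmetic behind that definition can be sketched as follows. This is a minimal illustration: the function name `pay_gap_percent` and the hourly rates are made up for the example and are not taken from the dataset.

```python
def pay_gap_percent(men_avg: float, women_avg: float) -> float:
    """Return the pay gap as a percentage of men's average earnings.

    A positive value means women earn less than men on average;
    a negative value means women earn more.
    """
    if men_avg <= 0:
        raise ValueError("men_avg must be positive")
    return (men_avg - women_avg) / men_avg * 100


# Hypothetical hourly rates, chosen only to illustrate the calculation.
gap = pay_gap_percent(men_avg=20.0, women_avg=17.0)
print(f"{gap:.1f}%")  # prints "15.0%"
```

Read this way, the negative values that appear in the data indicate employers where women's average pay is higher than men's.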

I have also included the UK industry codes (SIC codes) to allow industry-based analysis.

### Source:
https://www.kaggle.com/linavrgd/uk-pay-gap-data-2018

### Data Dictionary

| Variable                  | Definition                                                              | Type    |
|:--------------------------|-------------------------------------------------------------------------|---------|
| EmployerName              | Registered company name                                                 | String  |
| Address                   | Address, city, postcode                                                 | String  |
| CompanyNumber             | Company registration number in the UK                                   | String  |
| SicCodes                  | UK industry (SIC) codes                                                 | String  |
| DiffMeanHourlyPercent     | Mean gender gap in hourly pay, in percent                               | Float   |
| DiffMedianHourlyPercent   | Median gender gap in hourly pay, in percent                             | Float   |
| DiffMeanBonusPercent      | Mean gender gap in bonus pay, in percent                                | Float   |
| DiffMedianBonusPercent    | Median gender gap in bonus pay, in percent                              | Float   |
| MaleBonusPercent          | Percentage of male employees receiving a bonus                          | Float   |
| FemaleBonusPercent        | Percentage of female employees receiving a bonus                        | Float   |
| MaleLowerQuartile         | Percentage of employees in the lower pay quartile who are male          | Float   |
| FemaleLowerQuartile       | Percentage of employees in the lower pay quartile who are female        | Float   |
| MaleLowerMiddleQuartile   | Percentage of employees in the lower middle pay quartile who are male   | Float   |
| FemaleLowerMiddleQuartile | Percentage of employees in the lower middle pay quartile who are female | Float   |
| MaleUpperMiddleQuartile   | Percentage of employees in the upper middle pay quartile who are male   | Float   |
| FemaleUpperMiddleQuartile | Percentage of employees in the upper middle pay quartile who are female | Float   |
| MaleTopQuartile           | Percentage of employees in the top pay quartile who are male            | Float   |
| FemaleTopQuartile         | Percentage of employees in the top pay quartile who are female          | Float   |
| CompanyLinkToGPGInfo      | Link to the company's summary report                                    | String  |
| ResponsiblePerson         | Point of contact                                                        | String  |
| EmployerSize              | Employer size band (number of employees)                                | String  |
| CurrentName               | Current company name                                                    | String  |
| SubmittedAfterTheDeadline | Whether the data were submitted after the deadline                      | Boolean |
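
In the first rows of the dataset, each `Male…Quartile` / `Female…Quartile` pair sums to 100, which suggests that these columns give the gender split within a quartile rather than the share of all men or women who fall into it. A quick sanity check along those lines might look like the sketch below. It uses the top-quartile values of the first five employers, and the 0.1 tolerance is an assumption to allow for rounding in the reported figures.

```python
import pandas as pd

# Top-quartile values for the first five employers in the dataset.
sample = pd.DataFrame({
    "MaleTopQuartile": [51.5, 18.1, 58.0, 24.0, 58.2],
    "FemaleTopQuartile": [48.5, 81.9, 42.0, 76.0, 41.8],
})

total = sample["MaleTopQuartile"] + sample["FemaleTopQuartile"]

# Allow a small tolerance (assumed) for rounding in the reported figures.
complementary = (total - 100).abs() <= 0.1
print(complementary.all())
```

Running the same check on the full DataFrame, for all four quartile pairs, would flag rows where the reported percentages are inconsistent.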

### Initial Goals:
* Perform an exploratory analysis to understand:
  * The distribution of the hourly and bonus pay gap across industries, locations, and company sizes.
  * The ratio of male vs. female participation in each pay quartile across industries, locations, and company sizes.
* Create a prediction model that identifies the most important predictor variables and estimates the potential pay gap based on these predictors.
  * **Current predictor variables**: Industry, location, company size.
  * **Potential predictor variables** *(to be added to the dataset)*: Ownership status (public vs. private), membership of the FTSE 100 and FTSE 250.
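
As a starting point for the exploratory goal, the pay gap can be summarised per company size band with a `groupby`. The sketch below is only an illustration: it uses five rows taken from the start of the dataset so that it runs on its own, whereas the real analysis would run on the full DataFrame. The choice of median and count as summary statistics is an assumption, not a fixed part of the plan.

```python
import pandas as pd

# Five rows taken from the start of the dataset, for illustration only.
sample = pd.DataFrame({
    "EmployerSize": ["500 to 999", "250 to 499", "500 to 999",
                     "250 to 499", "250 to 499"],
    "DiffMeanHourlyPercent": [18.0, 2.3, 1.7, -22.0, 13.4],
    "DiffMedianHourlyPercent": [28.2, -2.7, 2.8, -34.0, 8.1],
})

gap_columns = ["DiffMeanHourlyPercent", "DiffMedianHourlyPercent"]

# Median gap and number of employers per size band.
summary = sample.groupby("EmployerSize")[gap_columns].agg(["median", "count"])
print(summary)
```

The same pattern would apply to industry once `SicCodes` has been cleaned, and to location once a region has been extracted from `Address`.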
  

#### Other datasets to be explored:
* List of all companies trading on the London Stock Exchange: http://www.londonstockexchange.com/statistics/companies-and-issuers/companies-and-issuers.htm
* List of all companies operating in the UK: http://download.companieshouse.gov.uk/en_output.html


*Note:* Companies House search: https://beta.companieshouse.gov.uk/search?q=

In [29]:
import pandas as pd

In [32]:
df = pd.read_csv('Final_Project/UK_Gender_PayGap_2018.csv')

In [33]:
df.head(5)

| Unnamed: 0 | EmployerName | Address | CompanyNumber | SicCodes | DiffMeanHourlyPercent | DiffMedianHourlyPercent | DiffMeanBonusPercent | DiffMedianBonusPercent | MaleBonusPercent | FemaleBonusPercent | ... | FemaleLowerMiddleQuartile | MaleUpperMiddleQuartile | FemaleUpperMiddleQuartile | MaleTopQuartile | FemaleTopQuartile | CompanyLinkToGPGInfo | ResponsiblePerson | EmployerSize | CurrentName | SubmittedAfterTheDeadline |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | "Bryanston School",Incorporated | `Bryanston House,\r\nBlandford,\r\nDorset,\r\nU...` | 00226143 | 85310 | 18.0 | 28.2 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 49.2 | 49.2 | 50.8 | 51.5 | 48.5 | https://www.bryanston.co.uk/employment | Nick McRobb (Bursar and Clerk to the Governors) | 500 to 999 | "Bryanston School",Incorporated | False |
| 1 | "RED BAND" CHEMICAL COMPANY, LIMITED | `19, Smith's Place,\r\nLeith Walk,\r\nEdinburgh...` | SC016876 | 47730 | 2.3 | -2.7 | 15.0 | 37.5 | 15.6 | 66.7 | ... | 74.6 | 10.3 | 89.7 | 18.1 | 81.9 | | Philip Galt (Managing Director) | 250 to 499 | "RED BAND" CHEMICAL COMPANY, LIMITED | False |
| 2 | 118 LIMITED | `Fusion Point,\r\nDumballs Road,\r\nCardiff,\r\...` | 03951948 | 61900 | 1.7 | 2.8 | 13.1 | 13.6 | 70.0 | 57.0 | ... | 47.0 | 50.0 | 50.0 | 58.0 | 42.0 | | Emma Crowe (VP, Human Resources) | 500 to 999 | 118 LIMITED | False |
| 3 | 1610 LIMITED | `Hestercombe House,\r\nCheddon Fitzpaine,\r\nTa...` | 06727055 | 93110 | -22.0 | -34.0 | -47.0 | -67.0 | 25.0 | 75.0 | ... | 48.0 | 30.0 | 70.0 | 24.0 | 76.0 | https://www.1610.org.uk/gender-pay-gap/ | Tim Nightingale (CEO) | 250 to 499 | 1610 LIMITED | True |
| 4 | 1879 EVENTS MANAGEMENT LIMITED | `The Sunderland Stadium Of Light,,\r\nSunderlan...` | 07743495 | `56210,\r\n70229` | 13.4 | 8.1 | 41.4 | 43.7 | 8.7 | 3.2 | ... | 50.6 | 22.8 | 77.2 | 58.2 | 41.8 | https://www.safc.com/news/club-news/2018/march... | Jo Graham (Deputy HR Manager) | 250 to 499 | 1879 EVENTS MANAGEMENT LIMITED | False |
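
The output above shows that `SicCodes` can hold several codes in one cell, separated by a comma and a `\r\n` line break (for example `56210,\r\n70229`). Industry-based analysis will probably need one row per code. The sketch below shows one possible way to get there, using two rows from the output. The new column name `SicCode` is an assumption, not part of the dataset.

```python
import pandas as pd

# Two rows from the output above: one single-code and one multi-code employer.
sample = pd.DataFrame({
    "EmployerName": ["118 LIMITED", "1879 EVENTS MANAGEMENT LIMITED"],
    "SicCodes": ["61900", "56210,\r\n70229"],
})

sic_long = (
    sample
    # Split each cell on commas into a list of raw codes.
    .assign(SicCode=sample["SicCodes"].str.split(","))
    # Expand the lists so that each code gets its own row.
    .explode("SicCode")
    # Remove the leftover "\r\n" and any surrounding whitespace.
    .assign(SicCode=lambda d: d["SicCode"].str.strip())
    .drop(columns="SicCodes")
    .reset_index(drop=True)
)
print(sic_long)
```

An employer with several codes then appears once per code, so counts per industry will no longer add up to the number of employers. Whether to keep all codes or only the first one is a design choice to settle before modelling.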
