## Judges Exploration
This notebook explores the judges data in `judges_clean.csv`.

In [1]:
import pandas as pd

In [2]:
pd.set_option('display.max_rows', 50)

In [None]:
pd.reset_option('all')

### Setup
Load each CSV into a `pd.DataFrame`, check the `shape` of the data, look at a few rows with `head()`, and get a feel for the values each column takes using `value_counts()`.
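As a quick illustration of those three calls, here is a minimal sketch on a made-up frame. `toy_df` and its rows are hypothetical stand-ins; only the column names and category labels are borrowed from `judges_clean.csv`.

```python
import pandas as pd

# Toy stand-in for judges_df; these rows are invented for illustration only.
toy_df = pd.DataFrame({
    'judge_position': ['civil court', 'civil court', 'chief judicial magistrate'],
    'female_judge': ['0 nonfemale', '1 female', '0 nonfemale'],
})

print(toy_df.shape)    # (number of rows, number of columns)
print(toy_df.head(2))  # first two rows

# How often each distinct value appears, most frequent first
counts = toy_df['judge_position'].value_counts()
print(counts)
```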

In [4]:
judges_df = pd.read_csv('../data/judges_clean.csv')
states_df = pd.read_csv('../data/keys/cases_state_key.csv')
districts_df = pd.read_csv('../data/keys/cases_district_key.csv')
courts_df = pd.read_csv('../data/keys/cases_court_key.csv')

In [5]:
judges_df['judge_position'].value_counts()  # looking at position data

district and sessions court                       20286
chief judicial magistrate                         14373
civil judge senior division                       13632
civil judge junior division                        9918
civil court                                        5416
                                                  ...  
j.m 1st class cum addl. munsif-ii                     1
presiding officer ftc i                               1
v-addl.munsif- iii                                    1
adhoc addl. distt. and sess. judge - ii               1
court no. 7 of j.m 1st class, muzaffarpur east        1
Name: judge_position, Length: 565, dtype: int64

In [6]:
judges_df['female_judge'].value_counts()  # looking at gender data

0 nonfemale      67540
1 female         27202
-9998 unclear     3735
Name: female_judge, dtype: int64

### First Topic
I will compare judge positions by gender.

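### Some analysis (rough)
Nonfemale judges outnumber female judges by about 2.5 to 1 (67540 vs 27202), which makes raw counts hard to compare.
Among female judges the most common position is `chief judicial magistrate` (5130), while `district and sessions court` ranks third (3499); among nonfemale judges `district and sessions court` is by far the most common (16272).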

In [7]:
male_filt = judges_df['female_judge'] == '0 nonfemale'
female_filt = judges_df['female_judge'] == '1 female'

In [8]:
judges_df[female_filt]['judge_position'].value_counts()

chief judicial magistrate      5130
civil judge senior division    4478
district and sessions court    3499
civil judge junior division    3448
civil court                    1444
                               ... 
sub judge-iv                      1
sub judge viii                    1
sub-judge 3rd                     1
munsif 1st                        1
5-additional district judge       1
Name: judge_position, Length: 234, dtype: int64

In [9]:
judges_df[male_filt]['judge_position'].value_counts()

district and sessions court                 16272
chief judicial magistrate                    9094
civil judge senior division                  8960
civil judge junior division                  6278
civil court                                  3201
                                            ...  
a.d.j. x                                        1
5-judicial magistrate court                     1
5-munsiff                                       1
v-j.m. ist class-cum-addl. munsif               1
3-additional civil judge senior division        1
Name: judge_position, Length: 546, dtype: int64
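Since the two groups differ so much in size, one option is to compare each position's share within a group rather than its raw count. The sketch below does this with `pd.crosstab(..., normalize='columns')` on a made-up frame. `toy_df` is hypothetical, and dropping the `-9998 unclear` rows is my assumption, not something the notebook does.

```python
import pandas as pd

# Invented rows for illustration; labels mirror those in judges_df.
toy_df = pd.DataFrame({
    'judge_position': ['district and sessions court'] * 4
                      + ['chief judicial magistrate'] * 4,
    'female_judge': ['0 nonfemale', '0 nonfemale', '0 nonfemale', '1 female',
                     '0 nonfemale', '1 female', '1 female', '-9998 unclear'],
})

# Assumption: leave out rows where gender could not be determined
known = toy_df[toy_df['female_judge'] != '-9998 unclear']

# Each column sums to 1, so cells are the share of that gender group
# holding a given position
shares = pd.crosstab(known['judge_position'], known['female_judge'],
                     normalize='columns')
print(shares.round(2))
```

On the real data, the same call with `judges_df` in place of `toy_df` should give comparable columns even though one group is much larger than the other.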

### More rough analysis
We can compute the share of female judges in each state. My hypothesis is that, since south India is generally considered more progressive, the share of women will be higher in southern states than in northern ones.

#### Results
From the analysis, it seems I was not entirely right about southern states having a higher share of female judges, but northern states such as Uttar Pradesh and Jammu and Kashmir do sit near the bottom. Moreover, an interesting observation is that four of the five states where women make up 50% or more of judges (Mizoram, Meghalaya, Sikkim, Manipur) are in the northeast; Goa is the exception.

The list below holds `(female_percentage_of_judges, state_name)` tuples, sorted in descending order.
```
[(64.86, 'Mizoram'),
 (62.16, 'Meghalaya'),
 (58.79, 'Sikkim'),
 (50.62, 'Goa'),
 (50.0, 'Manipur'),
 (45.44, 'Gujarat'),
 (41.28, 'Punjab'),
 (39.55, 'Assam'),
 (39.33, 'Uttarakhand'),
 (36.43, 'Chandigarh'),
 (35.81, 'Tamil Nadu'),
 (35.5, 'Haryana'),
 (35.28, 'Delhi'),
 (35.11, 'Andhra Pradesh'),
 (34.54, 'Telangana'),
 (33.33, 'Chhattisgarh'),
 (33.15, 'Orissa'),
 (32.05, 'Karnataka'),
 (31.58, 'Rajasthan'),
 (30.33, 'West Bengal'),
 (28.34, 'Maharashtra'),
 (28.24, 'Kerala'),
 (28.09, 'Himachal Pradesh'),
 (28.08, 'Tripura'),
 (21.92, 'Madhya Pradesh'),
 (21.04, 'Uttar Pradesh'),
 (16.46, 'Jammu and Kashmir'),
 (9.99, 'Jharkhand'),
 (6.67, 'Bihar')]
 ```

In [10]:
states_df.groupby(by='state_code').max()

state_code,year,state_name,pc11_state_name,pc11_state_id
1,2018,Maharashtra,maharashtra,27
2,2018,Andhra Pradesh,andhra pradesh,28
3,2018,Karnataka,karnataka,29
4,2018,Kerala,kerala,32
5,2018,Himachal Pradesh,himachal pradesh,2
6,2018,Assam,assam,18
7,2018,Jharkhand,jharkhand,20
8,2018,Bihar,bihar,10
9,2018,Rajasthan,rajasthan,8
10,2018,Tamil Nadu,tamil nadu,33


In [11]:
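# Per-state counts and percentages of nonfemale ('m') and female ('f') judges.
# Rows marked '-9998 unclear' match neither filter, so they are left out.
state_mfratio = {}
# groupby(...).max() gives one row per state_code
for code, row in states_df.groupby(by='state_code').max().iterrows():
    name = row['state_name']

    # number of female judges in this state
    filt = female_filt & (judges_df['state_code'] == code)
    nfemale = judges_df[filt].shape[0]

    # number of nonfemale judges in this state
    filt = male_filt & (judges_df['state_code'] == code)
    nmale = judges_df[filt].shape[0]

    # skip states with no judges of known gender (avoids division by zero)
    if nmale == 0 and nfemale == 0:
        continue

    state_mfratio[name] = {
        'm': nmale,
        'f': nfemale,
        'm_mf': round((nmale / (nmale + nfemale)) * 100, 2),  # % nonfemale
        'f_mf': round((nfemale / (nmale + nfemale)) * 100, 2),  # % female
    }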

In [12]:
# sort states by female percentage, highest first
state_fratio = []
for state, stat in state_mfratio.items():
    state_fratio.append((stat['f_mf'], state))

sorted(state_fratio, reverse=True)

[(64.86, 'Mizoram'),
 (62.16, 'Meghalaya'),
 (58.79, 'Sikkim'),
 (50.62, 'Goa'),
 (50.0, 'Manipur'),
 (45.44, 'Gujarat'),
 (41.28, 'Punjab'),
 (39.55, 'Assam'),
 (39.33, 'Uttarakhand'),
 (36.43, 'Chandigarh'),
 (35.81, 'Tamil Nadu'),
 (35.5, 'Haryana'),
 (35.28, 'Delhi'),
 (35.11, 'Andhra Pradesh'),
 (34.54, 'Telangana'),
 (33.33, 'Chhattisgarh'),
 (33.15, 'Orissa'),
 (32.05, 'Karnataka'),
 (31.58, 'Rajasthan'),
 (30.33, 'West Bengal'),
 (28.34, 'Maharashtra'),
 (28.24, 'Kerala'),
 (28.09, 'Himachal Pradesh'),
 (28.08, 'Tripura'),
 (21.92, 'Madhya Pradesh'),
 (21.04, 'Uttar Pradesh'),
 (16.46, 'Jammu and Kashmir'),
 (9.99, 'Jharkhand'),
 (6.67, 'Bihar')]
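The per-state loop above could also be written without explicit iteration. The following is a minimal sketch, not the notebook's method: it takes the mean of a boolean "is female" flag per `state_code`. `toy_judges` and `toy_states` are made-up frames, and `drop_duplicates` is swapped in for the `groupby(...).max()` step as an assumed way to get one name per state.

```python
import pandas as pd

# Invented rows for illustration; column names follow the notebook.
toy_judges = pd.DataFrame({
    'state_code': [1, 1, 1, 2, 2, 2, 2],
    'female_judge': ['1 female', '0 nonfemale', '0 nonfemale',
                     '1 female', '1 female', '0 nonfemale', '-9998 unclear'],
})
toy_states = pd.DataFrame({
    'state_code': [1, 1, 2],
    'year': [2017, 2018, 2018],
    'state_name': ['Maharashtra', 'Maharashtra', 'Andhra Pradesh'],
})

# Keep only rows with a known gender, as the loop's two filters do
known = toy_judges[toy_judges['female_judge'].isin(['0 nonfemale', '1 female'])]

# The mean of a boolean flag is the fraction of True values per group
is_female = known['female_judge'].eq('1 female')
female_pct = (is_female.groupby(known['state_code']).mean() * 100).round(2)

# One name per state_code (assumes names are consistent across years)
names = (toy_states.drop_duplicates('state_code')
         .set_index('state_code')['state_name'])

result = female_pct.rename(index=names.to_dict()).sort_values(ascending=False)
print(result)
```

States with no judges of known gender never appear in `known`, so they drop out of `result` without a separate zero check.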