## Exploring different loan types

In [1]:
import os
import pandas as pd 
import glob

In [2]:
os.listdir()

['Well_Being.ipynb',
 'info.pdf',
 '.DS_Store',
 'different_loan_types_julia.ipynb',
 'Demo_Individual.ipynb',
 'House Ownership.ipynb',
 'diaries_transactions_all.csv',
 'consumption_julia.ipynb',
 'README.md',
 '.gitignore',
 'Exploration_Ella.ipynb',
 '.ipynb_checkpoints',
 'aux_data_exploration_julia.ipynb',
 '.git',
 'Exploration_Loan_Ella.ipynb',
 'initial_exploration_julia.ipynb',
 'aux_data']

In [3]:
trx = pd.read_csv('diaries_transactions_all.csv', low_memory=False)

In [4]:
pd.set_option('display.max_columns',200)

In [5]:
trx.shape

(483949, 58)

In [6]:
trx.columns

Index(['hh_ids', 'unique_hhs', 'first_trx_date_hh', 'last_trx_date_hh',
       'tot_hh_daysofobs', 'tot_hh_monthsofobs', 'interview_designation',
       'int_date', 'int_month', 'int_year', 'int_yr_mo', 'first_int_date',
       'account_ids', 'unique_accnts', 'm_ids_owner', 'unique_hm_owner',
       'account_bsheet_desig', 'account_startclose_balance', 'account_formal',
       'account_liquid', 'first_trx_date_acc', 'last_trx_date_acc',
       'tot_acc_daysofobs', 'tot_acc_monthsofobs', 'trx_id', 'm_ids_trx',
       'trx_date', 'trx_month', 'trx_year', 'trx_yr_mo', 'trx_dq_round',
       'trx_stdtime_days_hh', 'trx_stdtime_mnths_hh', 'trx_stdtime_days_acc',
       'trx_stdtime_mnths_acc', 'trx_class_code', 'trx_class_desc',
       'trx_family_code', 'trx_family_desc', 'trx_type_code', 'trx_type_desc',
       'trx_prx_purpose', 'trx_prx_purpose_fd', 'trx_fee',
       'trx_bsheet_direction', 'trx_mode_code', 'trx_mode_desc',
       'trx_place_incommunity', 'trx_distance_km', 'trx_outlet'

## Account questions

### How many accounts are there? 

In [7]:
len(trx.account_ids.unique())

9547

Is this the same as the number of rows flagged in `unique_accnts`?

In [8]:
trx.unique_accnts.value_counts()

1.0    9546
Name: unique_accnts, dtype: int64

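Almost: 9546 rows are flagged in `unique_accnts`, one fewer than the 9547 unique values of `account_ids`.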
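
A one-off gap like this (9547 vs. 9546) can arise when the ID column contains a missing value, because `unique()` keeps NaN as a distinct value while `nunique()` drops it. Whether that is the cause here has not been checked; the sketch below only illustrates the behaviour on a toy Series with hypothetical IDs.

```python
import numpy as np
import pandas as pd

# Toy stand-in for trx.account_ids (hypothetical values, not the real data)
account_ids = pd.Series(["a1", "a1", "a2", "a3", np.nan])

n_with_nan = len(account_ids.unique())   # NaN is kept as its own value
n_without_nan = account_ids.nunique()    # NaN is dropped by default

print(n_with_nan, n_without_nan)
```

On the real data, comparing `trx.account_ids.nunique()` with the flagged-row count would confirm or rule out this explanation.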

### Account types?

In [9]:
trx.account_bsheet_desig.value_counts()

Asset        52021
Liability    23078
Insurance     1971
Name: account_bsheet_desig, dtype: int64

### How many unique liability accounts, e.g. loans?

In [10]:
trx.loc[trx['unique_accnts']==1].shape

(9546, 58)

In [11]:
trx.loc[trx['unique_accnts']==1].account_bsheet_desig.value_counts()

Asset        2828
Liability    1906
Insurance     254
Name: account_bsheet_desig, dtype: int64

1906 loans in total to work with. 
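
The three designation counts sum to 4988 (2828 + 1906 + 254), fewer than the 9546 flagged rows. Since `value_counts()` drops missing values by default, the remaining rows presumably have no designation. A minimal sketch on toy data (hypothetical rows, same column names) showing how `dropna=False` makes the missing group visible:

```python
import numpy as np
import pandas as pd

# Toy stand-in for trx (hypothetical rows)
toy = pd.DataFrame({
    "unique_accnts": [1.0, 1.0, 1.0, np.nan, 1.0],
    "account_bsheet_desig": ["Asset", "Liability", np.nan, "Asset", np.nan],
})

# Designation of the rows flagged as the first row of each account
flagged = toy.loc[toy["unique_accnts"] == 1, "account_bsheet_desig"]

counts = flagged.value_counts(dropna=False)  # includes a NaN bucket
n_missing = int(flagged.isna().sum())

print(counts)
print("missing designation:", n_missing)
```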

### How many of these accounts are formal? 

In [12]:
trx.loc[(trx['unique_accnts'] == 1) & (trx['account_bsheet_desig'] == 'Liability'), 'account_formal'].value_counts()

Informal    1691
Formal       215
Name: account_formal, dtype: int64
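
To read these counts as proportions, `value_counts(normalize=True)` can be used. The sketch below rebuilds a stand-in column from the two counts above rather than loading the real rows, so it only reproduces the arithmetic (215 of 1906, roughly 11%, are formal).

```python
import pandas as pd

# Stand-in column rebuilt from the counts reported above
account_formal = pd.Series(["Informal"] * 1691 + ["Formal"] * 215)

shares = account_formal.value_counts(normalize=True)
print(shares.round(3))
```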

### What kinds of transactions are happening with liabilities? 

In [13]:
trx.loc[trx['account_bsheet_desig']=="Liability"].trx_family_code.value_counts()

INFP2P     14755
SUPPCRD     4050
FRMLN       1429
INFGRP      1377
ARREARS     1058
ADVANCE      280
EMPLN        111
PAWN          16
OTHER          2
Name: trx_family_code, dtype: int64

In [14]:
trx.loc[trx['account_bsheet_desig']=="Liability"].trx_family_desc.value_counts()

Informal P2P                              14755
Supplier credit                            4050
Formal loan                                1429
Informal group                             1377
Arrears owed to or owed by respondents     1058
Advance                                     280
Loan from employer                          111
Pawning assets                               16
Other                                         2
Name: trx_family_desc, dtype: int64

### Which transaction families appear under "informal" liabilities, and which under "formal"?

In [15]:
trx.loc[(trx['account_bsheet_desig'] == 'Liability') & (trx['account_formal'] == 'Formal'), 'trx_family_desc'].value_counts()

Supplier credit    2973
Formal loan        1276
Other                 2
Name: trx_family_desc, dtype: int64

In [16]:
trx.loc[(trx['account_bsheet_desig'] == 'Liability') & (trx['account_formal'] == 'Informal'), 'trx_family_desc'].value_counts()

Informal P2P                              14755
Informal group                             1377
Supplier credit                            1077
Arrears owed to or owed by respondents     1058
Advance                                     280
Formal loan                                 153
Loan from employer                          111
Pawning assets                               16
Name: trx_family_desc, dtype: int64
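
The two tables above can also be viewed side by side with `pd.crosstab`, which makes it easier to spot families that appear under both labels (supplier credit and formal loans do in the counts above). A minimal sketch on toy data with hypothetical rows and the same column names:

```python
import pandas as pd

# Toy stand-in for trx (hypothetical rows)
toy = pd.DataFrame({
    "account_bsheet_desig": ["Liability"] * 5 + ["Asset"],
    "account_formal": ["Formal", "Informal", "Informal",
                       "Formal", "Informal", "Formal"],
    "trx_family_desc": ["Supplier credit", "Supplier credit", "Informal P2P",
                        "Formal loan", "Formal loan", "Other"],
})

# Keep liability transactions only, then tabulate family against formality
liab = toy[toy["account_bsheet_desig"] == "Liability"]
table = pd.crosstab(liab["trx_family_desc"], liab["account_formal"])

print(table)
```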

### How many start and closing balances are recorded for formal liabilities?

In [17]:
trx.loc[(trx['account_bsheet_desig'] == 'Liability') & (trx['account_formal'] == 'Formal'), 'account_startclose_balance'].value_counts()

Close    201
Start     77
Name: account_startclose_balance, dtype: int64

### For those listed as a closing balance, how many are zero? 

In [18]:
formal_close = (trx['account_bsheet_desig'] == 'Liability') & (trx['account_formal'] == 'Formal') & (trx['account_startclose_balance'] == 'Close')
trx.loc[formal_close, 'trx_value_kes'].value_counts().head()

0.0       114
20.0        5
50.0        4
10.0        4
1075.0      2
Name: trx_value_kes, dtype: int64
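
So 114 of the 201 closing rows are zero, roughly 57%, provided no `trx_value_kes` values are missing among them. Counting zeros directly avoids reading the answer off the top of `value_counts()`. A minimal sketch on a toy Series with hypothetical values:

```python
import pandas as pd

# Toy stand-in for the closing-balance values (hypothetical numbers)
close_values = pd.Series([0.0, 0.0, 20.0, 50.0, 0.0, 1075.0])

is_zero = close_values == 0
n_zero = int(is_zero.sum())        # number of zero closing balances
share_zero = float(is_zero.mean()) # fraction of rows that are zero

print(n_zero, share_zero)
```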

### What is the average number of days these liability accounts are observed? 

All liabilities: 

In [19]:
trx.loc[(trx['unique_accnts'] == 1) & (trx['account_bsheet_desig'] == 'Liability'), 'tot_acc_daysofobs'].mean()

221.24186778593915

Formal liabilities: 

In [20]:
trx.loc[(trx['unique_accnts'] == 1) & (trx['account_bsheet_desig'] == 'Liability') & (trx['account_formal'] == 'Formal'), 'tot_acc_daysofobs'].mean()

236.05116279069767
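
The formal and informal averages can be obtained in one step with `groupby`, instead of one filtered query per group. A minimal sketch on toy data (hypothetical rows, same column names):

```python
import numpy as np
import pandas as pd

# Toy stand-in for trx (hypothetical rows)
toy = pd.DataFrame({
    "unique_accnts": [1.0, 1.0, 1.0, 1.0, np.nan],
    "account_bsheet_desig": ["Liability", "Liability", "Liability",
                             "Asset", "Liability"],
    "account_formal": ["Formal", "Informal", "Informal", "Formal", "Formal"],
    "tot_acc_daysofobs": [300, 100, 200, 50, 999],
})

# One row per liability account, then the mean observation length per group
mask = (toy["unique_accnts"] == 1) & (toy["account_bsheet_desig"] == "Liability")
mean_days = toy.loc[mask].groupby("account_formal")["tot_acc_daysofobs"].mean()

print(mean_days)
```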

## How many unique liability accounts are there?