# Kiva Microcredit Platform - Analysis and Visualizations

Kiva.org is a non-profit organization that operates an online platform to facilitate micro-lending to individuals and small businesses mostly in the global south. The platform connects individual lenders with entrepreneurs and borrowers who need financial assistance to start or expand their businesses, access education, or improve their living conditions.

In a recent project I worked with data from this organization. It was interesting to dig deeper into the field of microcredit and look at the underlying dynamics.

The data is stored here: https://www.kaggle.com/datasets/tobiasdata123/kiva-org-data

## Loading libraries

In [61]:
import numpy as np
import pandas as pd 
import plotly.express as px 
from plotly.offline import init_notebook_mode, iplot
init_notebook_mode(connected=True)

## Data Preparation



### Loading data

In [68]:
# Loading the CSV file

df_kiva = pd.read_csv('kiva_data.csv',
                        sep='#',
                        index_col=0,
                        skipinitialspace=True)

# Quick check after loading
df_kiva.head()

Unnamed: 0,funded_amount,loan_amount,activity,sector,use,country_code,country,region,currency,term_in_months,lender_count,borrower_genders,repayment_interval
0,300.0,300.0,Fruits & Vegetables,Food,"To buy seasonal, fresh fruits to sell.",PK,Pakistan,Lahore,PKR,12.0,12,female,irregular
1,575.0,575.0,Rickshaw,Transportation,to repair and maintain the auto rickshaw used ...,PK,Pakistan,Lahore,PKR,11.0,14,"female, female",irregular
2,150.0,150.0,Transportation,Transportation,To repair their old cycle-van and buy another ...,IN,India,Maynaguri,INR,43.0,6,female,bullet
3,200.0,200.0,Embroidery,Arts,to purchase an embroidery machine and a variet...,PK,Pakistan,Lahore,PKR,11.0,8,female,irregular
4,400.0,400.0,Milk Sales,Food,to purchase one buffalo.,PK,Pakistan,Abdul Hakeem,PKR,14.0,16,female,monthly


## EDA

In [None]:
# Looking at column names and data types

df_kiva.info()

<class 'pandas.core.frame.DataFrame'>
Index: 671205 entries, 0 to 671204
Data columns (total 13 columns):
 #   Column              Non-Null Count   Dtype  
---  ------              --------------   -----  
 0   funded_amount       671205 non-null  float64
 1   loan_amount         671205 non-null  float64
 2   activity            671205 non-null  object 
 3   sector              671205 non-null  object 
 4   use                 666972 non-null  object 
 5   country_code        671197 non-null  object 
 6   country             671205 non-null  object 
 7   region              614405 non-null  object 
 8   currency            671205 non-null  object 
 9   term_in_months      671205 non-null  float64
 10  lender_count        671205 non-null  int64  
 11  borrower_genders    666984 non-null  object 
 12  repayment_interval  671205 non-null  object 
dtypes: float64(3), int64(1), object(9)
memory usage: 71.7+ MB


## Finding duplicates

In [None]:
# Looking for duplicates

df_kiva.loc[df_kiva.duplicated(subset=['funded_amount', 'loan_amount', 'activity', 'sector', 'use', 'country_code', 'country', 'region', 'currency', 'term_in_months', 'lender_count', 'borrower_genders', 'repayment_interval'],keep=False),:]

Unnamed: 0,funded_amount,loan_amount,activity,sector,use,country_code,country,region,currency,term_in_months,lender_count,borrower_genders,repayment_interval
327,275.0,275.0,Farming,Agriculture,to buy fertilizers and other farm supplies.,PH,Philippines,"Brookes Point, Palawan",PHP,8.0,8,female,irregular
392,100.0,100.0,Home Energy,Personal Use,to buy a solar lamp.,SV,El Salvador,,USD,14.0,4,male,monthly
405,100.0,100.0,Home Energy,Personal Use,to buy a solar-powered lamp.,SV,El Salvador,,USD,14.0,4,male,monthly
498,100.0,100.0,Home Energy,Personal Use,to buy a solar-powered lamp.,SV,El Salvador,,USD,14.0,4,male,monthly
606,100.0,100.0,Home Energy,Personal Use,to buy a solar-powered lamp.,SV,El Salvador,,USD,14.0,4,male,monthly
...,...,...,...,...,...,...,...,...,...,...,...,...,...
671200,0.0,25.0,Livestock,Agriculture,"[True, u'para compara: cemento, arenya y ladri...",PY,Paraguay,Concepción,USD,13.0,0,female,monthly
671201,25.0,25.0,Livestock,Agriculture,"[True, u'to start a turducken farm.'] - this l...",KE,Kenya,,KES,13.0,1,female,monthly
671202,0.0,25.0,Games,Entertainment,,KE,Kenya,,KES,13.0,0,,monthly
671203,0.0,25.0,Livestock,Agriculture,"[True, u'to start a turducken farm.'] - this l...",KE,Kenya,,KES,13.0,0,female,monthly
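
The `keep=False` argument used above marks every member of a duplicate group, which is handy for inspection but overstates how many rows would actually be removed. A minimal sketch on made-up toy data (only the column names mirror the notebook) contrasts it with the default `keep='first'`:

```python
import pandas as pd

# Toy data; the values are invented for illustration.
loans = pd.DataFrame({
    "loan_amount": [100.0, 100.0, 100.0, 275.0],
    "country": ["El Salvador", "El Salvador", "El Salvador", "Philippines"],
})

# keep=False flags every row that has at least one identical twin.
all_copies = loans.duplicated(keep=False)

# The default keep="first" flags only the repeats,
# i.e. the rows that drop_duplicates() would remove.
repeats_only = loans.duplicated()

deduplicated = loans.drop_duplicates()
print(all_copies.sum(), repeats_only.sum(), len(deduplicated))
```

Whether identical rows in the Kiva data are true duplicates or simply distinct loans with the same attributes cannot be decided from the columns shown, so this is an inspection aid rather than a recommendation to drop them.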


## Missing values

In [None]:
#Show NaNs

df_kiva.loc[:,df_kiva.columns].isna().sum()

funded_amount             0
loan_amount               0
activity                  0
sector                    0
use                    4233
country_code              8
country                   0
region                56800
currency                  0
term_in_months            0
lender_count              0
borrower_genders       4221
repayment_interval        0
dtype: int64

In [None]:
# Percentage of NaNs for column

df_kiva.loc[:,df_kiva.columns].isna().mean().sort_values(ascending=False)

region                0.084624
use                   0.006307
borrower_genders      0.006289
country_code          0.000012
funded_amount         0.000000
loan_amount           0.000000
activity              0.000000
sector                0.000000
country               0.000000
currency              0.000000
term_in_months        0.000000
lender_count          0.000000
repayment_interval    0.000000
dtype: float64

In [None]:
# Removing column region
df_kiva.drop(columns = ['region'], inplace= True)

Filling in the missing country codes looks feasible.

In [None]:
# Identifying NaNs in country_code

df_kiva.loc[(df_kiva.loc[:,'country_code'].isnull())]

Unnamed: 0,funded_amount,loan_amount,activity,sector,use,country_code,country,currency,term_in_months,lender_count,borrower_genders,repayment_interval
202537,4150.0,4150.0,Wholesale,Wholesale,To purchase lighting products for sale to loca...,,Namibia,NAD,6.0,162,female,bullet
202823,4150.0,4150.0,Wholesale,Wholesale,To purchase lighting products for sale to loca...,,Namibia,NAD,6.0,159,male,bullet
344929,3325.0,3325.0,Wholesale,Wholesale,To purchase lighting products for sale to loca...,,Namibia,NAD,7.0,120,female,bullet
351177,3325.0,3325.0,Wholesale,Wholesale,To purchase lighting products for sale to loca...,,Namibia,NAD,7.0,126,male,bullet
420953,3325.0,3325.0,Wholesale,Wholesale,To purchase lighting products for sale to loca...,,Namibia,NAD,7.0,118,female,bullet
421218,4000.0,4000.0,Wholesale,Wholesale,purchase solar lighting products for sale to l...,,Namibia,NAD,7.0,150,male,bullet
487207,5100.0,5100.0,Renewable Energy Products,Retail,to pay for stock of solar lights and cell phon...,,Namibia,NAD,7.0,183,male,bullet
487653,5000.0,5000.0,Wholesale,Wholesale,to maintain a stock of solar lights and cell p...,,Namibia,NAD,7.0,183,female,bullet


In [None]:
# pandas' read_csv treats the string 'NA' as a missing value by default,
# which is presumably why Namibia's ISO code shows up as NaN here.

df_kiva.loc[(df_kiva.loc[:,'country_code']=='NA')]
df_kiva.loc[(df_kiva.loc[:,'country']=='Namibia')]

# Replacing NaN with the string 'NA' for Namibia

df_kiva.loc[(df_kiva.loc[:,'country_code'].isnull()), 'country_code'] = 'NA'
df_kiva.loc[(df_kiva.loc[:,'country']=='Namibia')]

Unnamed: 0,funded_amount,loan_amount,activity,sector,use,country_code,country,currency,term_in_months,lender_count,borrower_genders,repayment_interval
202537,4150.0,4150.0,Wholesale,Wholesale,To purchase lighting products for sale to loca...,,Namibia,NAD,6.0,162,female,bullet
202823,4150.0,4150.0,Wholesale,Wholesale,To purchase lighting products for sale to loca...,,Namibia,NAD,6.0,159,male,bullet
344929,3325.0,3325.0,Wholesale,Wholesale,To purchase lighting products for sale to loca...,,Namibia,NAD,7.0,120,female,bullet
351177,3325.0,3325.0,Wholesale,Wholesale,To purchase lighting products for sale to loca...,,Namibia,NAD,7.0,126,male,bullet
420953,3325.0,3325.0,Wholesale,Wholesale,To purchase lighting products for sale to loca...,,Namibia,NAD,7.0,118,female,bullet
421218,4000.0,4000.0,Wholesale,Wholesale,purchase solar lighting products for sale to l...,,Namibia,NAD,7.0,150,male,bullet
487207,5100.0,5100.0,Renewable Energy Products,Retail,to pay for stock of solar lights and cell phon...,,Namibia,NAD,7.0,183,male,bullet
487653,5000.0,5000.0,Wholesale,Wholesale,to maintain a stock of solar lights and cell p...,,Namibia,NAD,7.0,183,female,bullet
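
The Namibia problem can also be avoided at load time. The following is a small sketch, not the notebook's actual loading code: it uses an in-memory, comma-separated toy file and the `keep_default_na` and `na_values` parameters of `pd.read_csv` so that only empty fields count as missing.

```python
import io
import pandas as pd

# Toy CSV; the real file uses sep='#' and an index column.
raw = "country,country_code\nNamibia,NA\nKenya,KE\nUnknown,\n"

# Default behaviour: both 'NA' and the empty field become NaN.
default = pd.read_csv(io.StringIO(raw))

# Only treat empty strings as missing, so 'NA' survives as a string.
explicit = pd.read_csv(io.StringIO(raw), keep_default_na=False, na_values=[""])

print(default["country_code"].isna().sum())   # missing values with defaults
print(explicit["country_code"].tolist())
```

With this setting, other default markers such as `NaN` or `null` would no longer be recognized either, so the list passed to `na_values` would have to match whatever the file really uses.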


In [None]:
# Checking again for remaining NaNs
df_kiva.loc[:,df_kiva.columns].isna().sum()

funded_amount            0
loan_amount              0
activity                 0
sector                   0
use                   4233
country_code             0
country                  0
currency                 0
term_in_months           0
lender_count             0
borrower_genders      4221
repayment_interval       0
dtype: int64

About 4k NaN values remain in each of the columns use and borrower_genders. We check whether they overlap:

In [None]:
# Checking NaN overlap in gender and use

df_kiva.loc[(df_kiva.loc[:,'use'].isnull() & df_kiva.loc[:,'borrower_genders'].isnull())]
df_kiva.loc[(df_kiva.loc[:,'use'].isnull() & df_kiva.loc[:,'borrower_genders'].isnull())].describe()


Unnamed: 0,funded_amount,loan_amount,term_in_months,lender_count
count,4221.0,4221.0,4221.0,4221.0
mean,923.028903,1173.655532,15.127458,26.918977
std,1454.483435,2319.878002,8.917652,40.103457
min,0.0,25.0,2.0,0.0
25%,250.0,325.0,9.0,7.0
50%,500.0,625.0,14.0,16.0
75%,1050.0,1200.0,18.0,33.0
max,50000.0,50000.0,133.0,1310.0
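
The count of 4,221 rows above equals the total number of missing `borrower_genders` values, so every row without gender information also lacks a `use` text. A compact way to see such an overlap in one table is a cross-tabulation of the two missing-value masks; the sketch below uses invented toy data:

```python
import numpy as np
import pandas as pd

# Toy data with the two columns of interest; values are made up.
toy = pd.DataFrame({
    "use": ["to buy seeds", np.nan, np.nan, "to buy a cow", np.nan],
    "borrower_genders": ["female", np.nan, np.nan, "male", "female"],
})

# Rows: is 'use' missing?  Columns: is 'borrower_genders' missing?
overlap = pd.crosstab(
    toy["use"].isna(),
    toy["borrower_genders"].isna(),
    rownames=["use missing"],
    colnames=["genders missing"],
)

both_missing = int((toy["use"].isna() & toy["borrower_genders"].isna()).sum())
print(overlap)
```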


In [None]:
# Deleting the remaining rows that contain NaNs

df_kiva.dropna(inplace=True)

In [None]:
# Dropping the use column because it is not needed for the further analysis

df_kiva.drop(columns =['use'], inplace= True)

This step completes the handling of missing values. NaNs were either removed or filled in. What remains is a dataset without missing values or placeholders.

## Identifying extreme values and outliers

In [None]:
# Looking at min and max values

df_kiva.describe()

Unnamed: 0,funded_amount,loan_amount,term_in_months,lender_count
count,666972.0,666972.0,666972.0,666972.0
mean,785.131835,840.272905,13.73022,20.551025
std,1128.005848,1187.875622,8.59619,28.366363
min,0.0,25.0,1.0,0.0
25%,250.0,275.0,8.0,7.0
50%,450.0,500.0,13.0,13.0
75%,900.0,1000.0,14.0,24.0
max,100000.0,100000.0,158.0,2986.0


In [None]:
df_kiva.loc[(df_kiva.loc[:,'funded_amount']==100000)]

Unnamed: 0,funded_amount,loan_amount,activity,sector,country_code,country,currency,term_in_months,lender_count,borrower_genders,repayment_interval
70499,100000.0,100000.0,Agriculture,Agriculture,HT,Haiti,USD,75.0,2986,female,irregular


In [None]:
df_kiva.loc[(df_kiva.loc[:,'term_in_months']==158)]

Unnamed: 0,funded_amount,loan_amount,activity,sector,country_code,country,currency,term_in_months,lender_count,borrower_genders,repayment_interval
215863,6725.0,6725.0,Higher education costs,Education,DO,Dominican Republic,DOP,158.0,244,female,irregular


In [None]:
# Frequency analysis for sectors

df_agg = df_kiva.groupby("sector").size()
fig = px.bar(x=df_agg, y=df_agg.index, title = 'Overview sectors')
fig.update_layout(yaxis={'categoryorder':'total ascending'},xaxis_title = 'Number of projects', yaxis_title = 'Sectors')

In [None]:
# Frequency analysis for countries

df_agg = df_kiva.groupby("country").size()
fig = px.bar(x = df_agg, y=df_agg.index, title = 'Overview Countries')
fig.update_layout(yaxis={'categoryorder':'total ascending'}, xaxis_title = 'Number of projects', yaxis_title = 'Countries')

## Extracting additional information from columns and creating new KPIs

In [None]:
# Generating the column number of borrowers

df_kiva['number_of_borrowers'] = df_kiva.borrower_genders.apply(lambda x: len(str(x).split(' ')))
df_kiva


Unnamed: 0,funded_amount,loan_amount,activity,sector,country_code,country,currency,term_in_months,lender_count,borrower_genders,repayment_interval,number_of_borrowers
0,300.0,300.0,Fruits & Vegetables,Food,PK,Pakistan,PKR,12.0,12,female,irregular,1
1,575.0,575.0,Rickshaw,Transportation,PK,Pakistan,PKR,11.0,14,"female, female",irregular,2
2,150.0,150.0,Transportation,Transportation,IN,India,INR,43.0,6,female,bullet,1
3,200.0,200.0,Embroidery,Arts,PK,Pakistan,PKR,11.0,8,female,irregular,1
4,400.0,400.0,Milk Sales,Food,PK,Pakistan,PKR,14.0,16,female,monthly,1
...,...,...,...,...,...,...,...,...,...,...,...,...
671199,0.0,25.0,Livestock,Agriculture,PY,Paraguay,USD,13.0,0,female,monthly,1
671200,0.0,25.0,Livestock,Agriculture,PY,Paraguay,USD,13.0,0,female,monthly,1
671201,25.0,25.0,Livestock,Agriculture,KE,Kenya,KES,13.0,1,female,monthly,1
671203,0.0,25.0,Livestock,Agriculture,KE,Kenya,KES,13.0,0,female,monthly,1


In [None]:
#Generating superordinate gender
df_kiva['gender_of_borrowers'] = df_kiva.borrower_genders.apply(lambda x: str(pd.Series(str(x).split(', ')).unique()))

In [None]:
#Editing new column further
df_kiva.loc[((df_kiva.loc[:,'gender_of_borrowers'] == "['female' 'male']") | (df_kiva.loc[:,'gender_of_borrowers'] == "['male' 'female']"), 'gender_of_borrowers')] = 'mixed'
df_kiva.loc[(df_kiva.loc[:,'gender_of_borrowers'] == "['female']"), 'gender_of_borrowers'] = 'female'
df_kiva.loc[(df_kiva.loc[:,'gender_of_borrowers'] == "['male']"), 'gender_of_borrowers'] = 'male'
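
The two derived columns can also be computed without going through the string representation of an array. The sketch below is one possible alternative, assuming (as in the notebook after `dropna`) that `borrower_genders` contains no missing values; `summarise_genders` is a hypothetical helper name.

```python
import pandas as pd

def summarise_genders(genders: str) -> str:
    """Collapse e.g. 'female, male, female' into 'female', 'male' or 'mixed'."""
    unique = {g.strip() for g in genders.split(",")}
    return unique.pop() if len(unique) == 1 else "mixed"

# Toy values in the same format as the borrower_genders column.
toy = pd.Series(["female", "female, female", "male, female", "male"])

# One borrower per comma-separated entry.
borrowers = toy.str.count(",") + 1

summary = toy.map(summarise_genders)
print(borrowers.tolist(), summary.tolist())
```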

In [None]:
#Fixing country code issues with new library

import pycountry_convert as pc

def country_to_continent(country_name):
    # The library raised an error for TL as a country code, possibly for other codes as well
    try:
        country_alpha2 = pc.country_name_to_country_alpha2(country_name)
        country_continent_code = pc.country_alpha2_to_continent_code(country_alpha2)
        country_continent_name = pc.convert_continent_code_to_continent_name(country_continent_code)
    except Exception:
        # Mark countries that could not be resolved so they can be fixed manually
        country_continent_name = 'please check'
    return country_continent_name


In [None]:
#Adding continent to countries
df_kiva['continent'] = df_kiva.country.apply(lambda x: country_to_continent(x))

In [None]:
#Checking for problems
df_kiva.loc[(df_kiva.loc[:,'continent']=='please check')]['country'].unique()

array(['Timor-Leste', 'Kosovo', 'The Democratic Republic of the Congo',
       'Virgin Islands', 'Myanmar (Burma)', "Cote D'Ivoire"], dtype=object)

In [None]:
# Manually fixing problems

df_kiva.loc[((df_kiva.loc[:,'country'] == 'The Democratic Republic of the Congo') | (df_kiva.loc[:,'country'] == "Cote D'Ivoire"), 'continent')] = 'Africa'
df_kiva.loc[(df_kiva.loc[:,'country'] == "Kosovo"), 'continent'] = 'Europe'
df_kiva.loc[((df_kiva.loc[:,'country'] == 'Timor-Leste') | (df_kiva.loc[:,'country'] == "Myanmar (Burma)"), 'continent')] = 'Asia'
df_kiva.loc[(df_kiva.loc[:,'country'] == 'Virgin Islands'), 'continent'] = 'North America'
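
Because the same manual fixes are needed again later in the notebook, they could be collected in one place. The following sketch is an assumption about how that might look: `continent_with_overrides` and `stub_lookup` are hypothetical names, and the stub only stands in for a real lookup such as the `country_to_continent` function defined above.

```python
from typing import Callable

# Manual assignments taken from the fixes in this notebook.
MANUAL_CONTINENTS = {
    "The Democratic Republic of the Congo": "Africa",
    "Cote D'Ivoire": "Africa",
    "Kosovo": "Europe",
    "Timor-Leste": "Asia",
    "Myanmar (Burma)": "Asia",
    "Virgin Islands": "North America",
}

def continent_with_overrides(country: str, lookup: Callable[[str], str]) -> str:
    """Use the manual table first, then fall back to the given lookup."""
    if country in MANUAL_CONTINENTS:
        return MANUAL_CONTINENTS[country]
    try:
        return lookup(country)
    except Exception:
        return "please check"

def stub_lookup(country: str) -> str:
    """Stand-in for a library lookup; raises KeyError for unknown names."""
    known = {"Kenya": "Africa", "Pakistan": "Asia"}
    return known[country]

print(continent_with_overrides("Kosovo", stub_lookup))
print(continent_with_overrides("Kenya", stub_lookup))
```

Applied to a DataFrame, this would be a single `apply` call per table instead of repeated `.loc` assignments.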
             


## Adding more KPIs


In [None]:
# percentage_of_credit_goal_funded = Funded_Amount / Loan_Amount * 100
df_kiva['percentage_of_credit_goal_funded'] = df_kiva.funded_amount / df_kiva.loan_amount * 100


In [None]:
# mean_investment_given_per_lender = Funded_Amount / Lender_Count
df_kiva['mean_investment_given_per_lender'] = df_kiva.funded_amount / df_kiva.lender_count

# Filling in 0 for NaNs
df_kiva.loc[(df_kiva.loc[:,'mean_investment_given_per_lender'].isnull()), 'mean_investment_given_per_lender'] = 0
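
A note on the division above: for float columns pandas does not raise on division by zero. `0 / 0` yields NaN (which the fill step catches), whereas a positive amount divided by zero lenders yields `inf`, which `isnull()` does not catch. The toy example below illustrates both cases and one way to handle them together; the values are invented.

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    "funded_amount": [25.0, 0.0, 50.0],
    "lender_count": [0, 0, 2],
})

# x / 0 gives inf, 0 / 0 gives NaN, no exception is raised.
ratio = toy["funded_amount"] / toy["lender_count"]

# Treat both cases as "no lenders" and set them to 0.
per_lender = ratio.replace([np.inf, -np.inf], np.nan).fillna(0)
print(ratio.tolist(), per_lender.tolist())
```

The maximum of 9475 reported later for `mean_investment_given_per_lender` suggests that no infinite values occur in this dataset, so the extra `replace` is a safeguard rather than a correction.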


In [None]:
# mean_investment_received_per_borrower = Funded_Amount / Number_of_Borrowers
df_kiva['mean_investment_received_per_borrower'] = df_kiva.funded_amount / df_kiva.number_of_borrowers

In [None]:
# mean_credit_goal_per_borrower = Loan_Amount / Number_of_Borrowers
df_kiva['mean_credit_goal_per_borrower'] = df_kiva.loan_amount / df_kiva.number_of_borrowers

In [None]:
# mean_total_credit_rates_per_month = Funded_Amount / Term in month
df_kiva['mean_total_credit_rates_per_month'] = df_kiva.funded_amount / df_kiva.term_in_months

In [None]:
# mean_total_credit_rates_per_month_per_person = Funded_Amount / Term in month / Number_of_Borrowers
df_kiva['mean_total_credit_rates_per_month_per_person'] = df_kiva.funded_amount / df_kiva.term_in_months / df_kiva.number_of_borrowers

### Categorization of data

In [None]:
# Frequency analysis for groupsize

df_number_of_borrowers = df_kiva.groupby("number_of_borrowers").size()
fig = px.bar(y=df_number_of_borrowers, log_y= True)
fig.update_layout(xaxis_title = 'Number of borrowers' , yaxis_title = 'Number of projects')
fig.add_vline(0.5)
fig.add_vline(4.5)
fig.add_vline(9.5)
fig.show()

In [None]:
# Finding labels for new categories
category = ["individual person", "small group (2-5)", "medium group (6-10)" , "large group (10+)"]

# Defining the bin edges (right-inclusive): (0, 1], (1, 5], (5, 10], (10, 100]

group_range = [0, 1, 5, 10, 100]

# Applying the new categories
df_kiva["group_size"] = pd.cut(x=df_kiva.loc[:,"number_of_borrowers"], 
                                      bins=group_range,        
                                      labels=category
                                     )


Unnamed: 0,funded_amount,loan_amount,activity,sector,country_code,country,currency,term_in_months,lender_count,borrower_genders,...,number_of_borrowers,gender_of_borrowers,continent,percentage_of_credit_goal_funded,mean_investment_given_per_lender,mean_investment_received_per_borrower,mean_credit_goal_per_borrower,mean_total_credit_rates_per_month,mean_total_credit_rates_per_month_per_person,group_size
0,300.0,300.0,Fruits & Vegetables,Food,PK,Pakistan,PKR,12.0,12,female,...,1,female,Asia,100.0,25.000000,300.0,300.0,25.000000,25.000000,individual person
1,575.0,575.0,Rickshaw,Transportation,PK,Pakistan,PKR,11.0,14,"female, female",...,2,female,Asia,100.0,41.071429,287.5,287.5,52.272727,26.136364,small group (2-5)
2,150.0,150.0,Transportation,Transportation,IN,India,INR,43.0,6,female,...,1,female,Asia,100.0,25.000000,150.0,150.0,3.488372,3.488372,individual person
3,200.0,200.0,Embroidery,Arts,PK,Pakistan,PKR,11.0,8,female,...,1,female,Asia,100.0,25.000000,200.0,200.0,18.181818,18.181818,individual person
4,400.0,400.0,Milk Sales,Food,PK,Pakistan,PKR,14.0,16,female,...,1,female,Asia,100.0,25.000000,400.0,400.0,28.571429,28.571429,individual person
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
671199,0.0,25.0,Livestock,Agriculture,PY,Paraguay,USD,13.0,0,female,...,1,female,South America,0.0,0.000000,0.0,25.0,0.000000,0.000000,individual person
671200,0.0,25.0,Livestock,Agriculture,PY,Paraguay,USD,13.0,0,female,...,1,female,South America,0.0,0.000000,0.0,25.0,0.000000,0.000000,individual person
671201,25.0,25.0,Livestock,Agriculture,KE,Kenya,KES,13.0,1,female,...,1,female,Africa,100.0,25.000000,25.0,25.0,1.923077,1.923077,individual person
671203,0.0,25.0,Livestock,Agriculture,KE,Kenya,KES,13.0,0,female,...,1,female,Africa,0.0,0.000000,0.0,25.0,0.000000,0.000000,individual person
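
By default `pd.cut` builds right-inclusive intervals, so with the edges `[0, 1, 5, 10, 100]` a group of exactly 10 borrowers still counts as a medium group and the label "large group (10+)" effectively means 11 or more. A short check on a few hand-picked group sizes:

```python
import pandas as pd

labels = ["individual person", "small group (2-5)",
          "medium group (6-10)", "large group (10+)"]

# Boundary values chosen to show where each bin starts and ends.
sizes = pd.Series([1, 2, 5, 6, 10, 11, 50])

# right=True is the default: bins are (0, 1], (1, 5], (5, 10], (10, 100].
groups = pd.cut(sizes, bins=[0, 1, 5, 10, 100], labels=labels)
print(pd.DataFrame({"size": sizes, "group": groups}))
```

Values above the last edge (more than 100 borrowers) would become NaN; the maximum of 50 reported later in the notebook means this does not happen here.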


## Optimizing memory use

In [None]:
# showing memory usage

df_kiva.memory_usage(deep=True) 

Index                                            5335776
funded_amount                                    5335776
loan_amount                                      5335776
activity                                        46227986
sector                                          43329295
country_code                                    39351348
country                                         43648891
currency                                        40018320
term_in_months                                   5335776
lender_count                                     5335776
borrower_genders                                46902998
repayment_interval                              43129118
number_of_borrowers                              5335776
gender_of_borrowers                             41701818
continent                                       42503276
percentage_of_credit_goal_funded                 5335776
mean_investment_given_per_lender                 5335776
mean_investment_received_per_bo

In [None]:
# Conversion of data types to save memory

df_kiva_reduced = df_kiva

df_kiva_reduced = df_kiva_reduced.astype({
    'funded_amount':'float32',
    'loan_amount':'float32',
    'activity':'category',
    'sector':'category',
    'country_code':'category',
    'country':'category',
    'currency':'category',
    'term_in_months':'int16',
    'lender_count':'int16',
    'borrower_genders':'category',
    'repayment_interval':'category',
    'number_of_borrowers':'int16',
    'gender_of_borrowers':'category',
    'continent':'category',
    'percentage_of_credit_goal_funded':'float32',
    'mean_investment_given_per_lender':'float32',
    'mean_investment_received_per_borrower':'float32',
    'mean_credit_goal_per_borrower':'float32',
    'mean_total_credit_rates_per_month':'float32',
    'mean_total_credit_rates_per_month_per_person':'float32',
    'group_size':'category'
})
df_kiva_reduced.info()

<class 'pandas.core.frame.DataFrame'>
Index: 666972 entries, 0 to 671204
Data columns (total 21 columns):
 #   Column                                        Non-Null Count   Dtype   
---  ------                                        --------------   -----   
 0   funded_amount                                 666972 non-null  float32 
 1   loan_amount                                   666972 non-null  float32 
 2   activity                                      666972 non-null  category
 3   sector                                        666972 non-null  category
 4   country_code                                  666972 non-null  category
 5   country                                       666972 non-null  category
 6   currency                                      666972 non-null  category
 7   term_in_months                                666972 non-null  int16   
 8   lender_count                                  666972 non-null  int16   
 9   borrower_genders                          

In [None]:
# Optimizing memory usage

reduction = (df_kiva.memory_usage(deep=True).sum()-df_kiva_reduced.memory_usage(deep=True).sum())/df_kiva.memory_usage(deep=True).sum()

print(reduction)

0.9093693854762896


The conversions saved about 91 percent of memory usage.
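
Most of the saving comes from the text columns: an object column stores one Python string per row, while a category column stores each distinct value once plus a small integer code per row. A minimal sketch on a synthetic column (three country names repeated many times) shows the effect; the exact percentage depends on the pandas version and the data.

```python
import pandas as pd

# Synthetic column with few distinct values, similar in spirit to 'country'.
countries = pd.Series(["Pakistan", "India", "Kenya"] * 10_000, dtype="object")
as_category = countries.astype("category")

before = countries.memory_usage(deep=True)
after = as_category.memory_usage(deep=True)
reduction = (before - after) / before
print(f"{reduction:.0%} saved")
```

The gain shrinks for columns with many distinct values, so a column like `borrower_genders` with its many combinations probably benefits less than `sector` or `continent`.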

# EDA

## Business question

**Business Logic**

Kiva is a platform that facilitates crowd-investment through micro-loans. It connects lenders and borrowers around the world.

The company generates its own profits by receiving commission for each project that lands on the platform.

**Data analysis requirement**

The company wants to expand its business. In principle, this can be done in several ways:
- The number of individual projects can be increased, as more agreements mean more commission opportunities.
- The volume of individual projects can be increased, as larger sums mean more commission.
- Entirely new business models could be developed that expand the company beyond its current core business.

Several steps could lead there:
- Entering regional markets that have not yet been tapped.
- Likewise, financing new sectors and activities.
- Focusing on specific sectors, activities or countries, either to turn weaknesses into strengths or to build on existing strengths.

Before any of these paths toward a target state can be chosen, the current state must first be mapped.

## Current state of kiva.org

In [None]:
df_kiva_reduced.describe()

Unnamed: 0,funded_amount,loan_amount,term_in_months,lender_count,number_of_borrowers,percentage_of_credit_goal_funded,mean_investment_given_per_lender,mean_investment_received_per_borrower,mean_credit_goal_per_borrower,mean_total_credit_rates_per_month,mean_total_credit_rates_per_month_per_person
count,666972.0,666972.0,666972.0,666972.0,666972.0,666972.0,666972.0,666972.0,666972.0,666972.0,666972.0
mean,785.131897,840.272827,13.73022,20.551025,2.018325,96.113731,61.017693,558.349426,604.222656,70.384796,42.666138
std,1128.005859,1187.87561,8.59619,28.366363,3.413485,15.729571,178.399033,830.445496,901.504272,125.827049,55.604282
min,0.0,25.0,1.0,0.0,1.0,0.0,0.0,0.0,2.777778,0.0,0.0
25%,250.0,275.0,8.0,7.0,1.0,100.0,26.666666,225.0,225.0,22.727272,19.791666
50%,450.0,500.0,13.0,13.0,1.0,100.0,31.25,375.0,400.0,37.5,32.142857
75%,900.0,1000.0,14.0,24.0,1.0,100.0,42.857143,650.0,700.0,65.625,51.388889
max,100000.0,100000.0,158.0,2986.0,50.0,113.333336,9475.0,100000.0,100000.0,10000.0,10000.0


A look at the cleaned data shows that the sample contains about 667,000 loans. Based on the medians in the table above, a typical loan looks something like this:

A single borrower requests 500 USD to be repaid within 13 months. The median funded amount is 450 USD, provided by 13 lenders. The upper quartile lies at 14 months and 24 lenders, and the mean group size is about two borrowers. At least three quarters of all loans are fully funded.

So we see that smaller sums for a few people have been the core business of Kiva so far. To see where and how these loans come about, we can look at distributions by geography, sector, gender and group size:

### Kiva around the World

In [None]:
# Creating grouped table
df_overview_continents = df_kiva_reduced.groupby(by='continent', as_index=False).agg(func={'funded_amount': ['count', 'sum', 'mean'], 'loan_amount': ['count', 'sum', 'mean']})
df_overview_continents

Unnamed: 0_level_0,continent,funded_amount,funded_amount,funded_amount,loan_amount,loan_amount,loan_amount
Unnamed: 0_level_1,Unnamed: 1_level_1,count,sum,mean,count,sum,mean
0,Africa,171423,126065344.0,735.40509,171423,133277504.0,777.477356
1,Asia,322761,190881504.0,591.402039,322761,201540704.0,624.427063
2,Europe,4624,6465425.0,1398.232056,4624,6893175.0,1490.738525
3,North America,82306,88570568.0,1076.113159,82306,101945048.0,1238.610107
4,Oceania,7931,6116920.0,771.267151,7931,6783325.0,855.292542
5,South America,77927,105561200.0,1354.616455,77927,109998752.0,1411.561523


A first look at the global values shows that Kiva has so far been by far the most active in Asia in terms of the number of loans granted. Interestingly, this large gap narrows when we look at total amounts, both requested and actually funded: Africa and South America catch up. The average loan size, however, is largest in Europe and South America, whereas Africa and Asia rank last.

### Kiva in various sectors

In [None]:
# Creating grouped table
df_overview_sectors = df_kiva_reduced.groupby(by='sector' , as_index=False).agg(func={'funded_amount': ['count', 'sum', 'mean'], 'loan_amount': ['count', 'sum', 'mean']})
df_overview_sectors

Unnamed: 0_level_0,sector,funded_amount,funded_amount,funded_amount,loan_amount,loan_amount,loan_amount
Unnamed: 0_level_1,Unnamed: 1_level_1,count,sum,mean,count,sum,mean
0,Agriculture,179221,133041272.0,742.330811,179221,142107056.0,792.915222
1,Arts,11994,11717555.0,976.951416,11994,12107800.0,1009.488098
2,Clothing,32478,34534520.0,1063.320435,32478,36976400.0,1138.506104
3,Construction,6194,6174600.0,996.86792,6194,6531950.0,1054.560913
4,Education,30837,29866944.0,968.54248,30837,30773050.0,997.926208
5,Entertainment,820,1000645.0,1220.298828,820,1366650.0,1666.646362
6,Food,135746,114191352.0,841.213379,135746,120509728.0,887.758972
7,Health,9172,9119750.0,994.303345,9172,9788100.0,1067.171875
8,Housing,33571,21253450.0,633.0896,33571,23513224.0,700.402832
9,Manufacturing,6158,5298525.0,860.429504,6158,5385475.0,874.549377


### Kiva and Gender

In [None]:
# Creating grouped table
df_overview_gender = df_kiva_reduced.groupby(by='gender_of_borrowers' , as_index=False).agg(func={'funded_amount': ['count', 'sum', 'mean'], 'loan_amount': ['count', 'sum', 'mean']})
df_overview_gender

Unnamed: 0_level_0,gender_of_borrowers,funded_amount,funded_amount,funded_amount,loan_amount,loan_amount,loan_amount
Unnamed: 0_level_1,Unnamed: 1_level_1,count,sum,mean,count,sum,mean
0,female,488074,339860544.0,696.329956,488074,356517184.0,730.457214
1,male,138520,110445488.0,797.325195,138520,126916728.0,916.233948
2,mixed,40378,73354928.0,1816.705322,40378,77004576.0,1907.092407


A first, superficial look at gender shows that many more loans have gone to women or women-only groups than to men or men-only groups. Mixed groups applying for loans are the rarest. The absolute amounts are distributed accordingly: women have been lent more money in total than men or mixed groups. Interestingly, however, this picture is reversed if we look at the average size of the respective loans. Here, women and women-only groups apply for and receive less per loan than men or mixed groups.

In [None]:
# Creating grouped table
df_overview_gender_1 = df_kiva_reduced.groupby(by='gender_of_borrowers' , as_index=False).agg(func={'mean_investment_received_per_borrower': ['count', 'mean'], })
df_overview_gender_1

Unnamed: 0_level_0,gender_of_borrowers,mean_investment_received_per_borrower,mean_investment_received_per_borrower
Unnamed: 0_level_1,Unnamed: 1_level_1,count,mean
0,female,488074,525.588745
1,male,138520,770.284424
2,mixed,40378,227.287292


A closer look at the amounts that actually reach people shows that women alone or in women-only groups receive significantly less money on average than men alone or in groups. Even less, however, is received per capita in mixed groups.

### Kiva and group size

In [None]:
# Creating grouped table
df_overview_groupsize = df_kiva_reduced.groupby(by='group_size' , as_index=False).agg(func={'funded_amount': ['count', 'sum', 'mean'], 'loan_amount': ['count', 'sum', 'mean']})
df_overview_groupsize

Unnamed: 0_level_0,group_size,funded_amount,funded_amount,funded_amount,loan_amount,loan_amount,loan_amount
Unnamed: 0_level_1,Unnamed: 1_level_1,count,sum,mean,count,sum,mean
0,individual person,561203,344880384.0,614.537659,561203,374039296.0,666.495544
1,small group (2-5),59444,54793376.0,921.764648,59444,57973924.0,975.269592
2,medium group (6-10),23264,53183848.0,2286.10083,23264,55950476.0,2405.023926
3,large group (10+),23061,70803328.0,3070.262695,23061,72474800.0,3142.743164


A first overview of group sizes shows that the platform was historically clearly designed for individuals and probably still is. As the group size increases, the number of projects decreases. The fact that large groups have a similar number of loans as medium groups is only due to the fact that the number of people behind them ranges from 11 to 50.

It is nevertheless striking that large groups have received more investment money in total than either small or medium groups.

In [None]:
# Creating grouped table
df_overview_groupsize_1 = df_kiva_reduced.groupby(by='group_size' , as_index=False).agg(func={'mean_investment_received_per_borrower': ['count', 'mean'], })
df_overview_groupsize_1

Unnamed: 0_level_0,group_size,mean_investment_received_per_borrower,mean_investment_received_per_borrower
Unnamed: 0_level_1,Unnamed: 1_level_1,count,mean
0,individual person,561203,614.537659
1,small group (2-5),59444,278.021088
2,medium group (6-10),23264,291.99823
3,large group (10+),23061,182.267029


However, if we break the investment sums down again to money received per capita, we find that the individual receives less and less as the group size increases, with a small countervailing result for medium groups, which receive slightly more money per person than small groups.
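
When reading these per-capita figures, it helps to keep in mind that the mean of `mean_investment_received_per_borrower` is an average over loans, not over people: a loan with ten borrowers counts once, just like a loan with one borrower. The toy example below (two invented loans) shows how the two views can differ:

```python
import pandas as pd

toy = pd.DataFrame({
    "funded_amount": [300.0, 1000.0],
    "number_of_borrowers": [1, 10],
})
toy["per_borrower"] = toy["funded_amount"] / toy["number_of_borrowers"]

# Average of the per-loan ratios (what the grouped tables report).
mean_over_loans = toy["per_borrower"].mean()

# Total money divided by total people (each borrower counts once).
mean_over_people = toy["funded_amount"].sum() / toy["number_of_borrowers"].sum()

print(mean_over_loans, round(mean_over_people, 2))
```

Which of the two is more appropriate depends on the question; the second one would answer "how much does an average borrower receive".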

## Analysis and plots

### World map for Kiva activities

For an analysis, it makes sense to first get an overview of Kiva's global business. A world map seems appropriate:

In [None]:
#Grouping data

df_scatter_country = df_kiva_reduced.groupby(['country', 'country_code'], observed= True, as_index= False).agg(funded_amount_count = ('funded_amount', 'count'),
                                                                               funded_amount_sum = ('funded_amount', 'sum'),
                                                                               funded_amount_mean = ('funded_amount', 'mean'))

# Adding continent
df_scatter_country['continent'] = df_scatter_country['country'].apply(country_to_continent)

# Manual continent assignments for selected countries
df_scatter_country.loc[df_scatter_country['country'].isin(['The Democratic Republic of the Congo', "Cote D'Ivoire"]), 'continent'] = 'Africa'
df_scatter_country.loc[df_scatter_country['country'] == 'Kosovo', 'continent'] = 'Europe'
df_scatter_country.loc[df_scatter_country['country'].isin(['Timor-Leste', 'Myanmar (Burma)']), 'continent'] = 'Asia'
df_scatter_country.loc[df_scatter_country['country'] == 'Virgin Islands', 'continent'] = 'North America'

# Project volume
df_scatter_country['country_type'] = 'Below 10 Mio USD'
df_scatter_country.loc[((df_scatter_country.loc[:,'funded_amount_sum'] > 10000000) & (df_scatter_country.loc[:,'funded_amount_count'] > 10000), 'country_type')] = 'Above 10 Mio USD, High Volume'
df_scatter_country.loc[((df_scatter_country.loc[:,'funded_amount_mean'] > 1000) & (df_scatter_country.loc[:,'funded_amount_sum'] > 10000000), 'country_type')] = 'Above 10 Mio USD, High Mean'
df_scatter_country.loc[((df_scatter_country.loc[:,'funded_amount_mean'] > 1000) & (df_scatter_country.loc[:,'funded_amount_count'] > 10000) & (df_scatter_country.loc[:,'funded_amount_sum'] > 10000000), 'country_type')] = 'Above 10 Mio USD, High Volume and Mean'


# Keep only countries with more than 100 loans to focus on regular business
df_scatter_country = df_scatter_country.loc[(df_scatter_country.loc[:,'funded_amount_count']>100)]
df_scatter_country

Unnamed: 0,country,country_code,funded_amount_count,funded_amount_sum,funded_amount_mean,continent,country_type
1,Albania,AL,1917,2469575.0,1288.249878,Europe,Below 10 Mio USD
2,Armenia,AM,8628,11186225.0,1296.502686,Asia,"Above 10 Mio USD, High Mean"
3,Azerbaijan,AZ,1905,2653225.0,1392.769043,Asia,Below 10 Mio USD
4,Belize,BZ,123,111975.0,910.365845,North America,Below 10 Mio USD
5,Benin,BJ,497,516825.0,1039.889282,Africa,Below 10 Mio USD
...,...,...,...,...,...,...,...
80,United States,US,5928,22676016.0,3825.238770,North America,"Above 10 Mio USD, High Mean"
82,Vietnam,VN,10841,13660925.0,1260.116699,Asia,"Above 10 Mio USD, High Volume and Mean"
84,Yemen,YE,2308,1771950.0,767.742615,Asia,Below 10 Mio USD
85,Zambia,ZM,750,1119250.0,1492.333374,Africa,Below 10 Mio USD
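The threshold logic in the grouping cell above relies on later assignments overwriting earlier ones. A minimal sketch of the same rules written with `np.select`, which makes the priority order explicit; the toy frame `df_demo` and its numbers are made up for illustration, while the thresholds and labels are taken from the cell above:

```python
import numpy as np
import pandas as pd

# Toy per-country aggregates (made-up numbers, not Kiva data)
df_demo = pd.DataFrame({
    "country": ["A", "B", "C", "D"],
    "funded_amount_count": [50_000, 5_000, 20_000, 2_000],
    "funded_amount_sum": [20_000_000.0, 15_000_000.0, 30_000_000.0, 1_000_000.0],
    "funded_amount_mean": [400.0, 3_000.0, 1_500.0, 500.0],
})

big = df_demo["funded_amount_sum"] > 10_000_000
high_volume = df_demo["funded_amount_count"] > 10_000
high_mean = df_demo["funded_amount_mean"] > 1_000

# np.select uses the first condition that matches, so the most specific rule goes first
conditions = [big & high_volume & high_mean, big & high_mean, big & high_volume]
labels = [
    "Above 10 Mio USD, High Volume and Mean",
    "Above 10 Mio USD, High Mean",
    "Above 10 Mio USD, High Volume",
]
df_demo["country_type"] = np.select(conditions, labels, default="Below 10 Mio USD")
print(df_demo[["country", "country_type"]])
```

Countries above 10 million USD that pass neither the volume nor the mean threshold stay in the default category, which is also what the sequential `.loc` assignments do.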


In [None]:
# Creating choropleth plot

fig = px.choropleth(df_scatter_country, 
                     locations= 'country',
                     locationmode= 'country names',
                     projection= 'natural earth', 
                     color= 'funded_amount_sum', 
                     hover_name = 'country',
                     hover_data = ['funded_amount_count', 'funded_amount_sum', 'funded_amount_mean'],
                     labels= {'country':'Country', 
                              'funded_amount_sum':'Total investment sum per country in USD',
                              'funded_amount_count':'Number of funded projects',
                              'funded_amount_mean':'Mean investment sum per project in USD'})

fig.update_layout(
    title_text='<b>World Map of Kiva Activity</b><br>(total investment sum per country in USD)',
    
)


fig.show()

The map gives a good overview of Kiva's activities so far, both past dynamics and the current state. The organization has been very active in Africa, Asia, and North and South America. Centres of investment stand out in lighter colours; the Philippines, Kenya and Peru are worth mentioning here. There is little activity in Europe, and the same applies to Australia and Oceania.

The overview could also be used to define a desirable target state. There are many blank spots on the map that could be potential markets, if desired: for example, the centre, the north and the southwest of the African continent. Central Asia does not seem to be fully developed yet either.

### Scatter plot for rough country clustering

In [None]:
#  Create scatter plot

fig = px.scatter(df_scatter_country, 
                     x= 'funded_amount_mean',
                     y = 'funded_amount_count', 
                     title = '<b>Overview funding per country with 100+ projects</b>',
                     color= 'country_type',
                     color_discrete_map= {'Below 10 Mio USD' : 'grey', 'Above 10 Mio USD, High Volume':'lightblue', 'Above 10 Mio USD, High Mean':'darkorange', 'Above 10 Mio USD, High Volume and Mean':'lightgreen'},
                     labels={'country_type': 'Country type','funded_amount_count':"Number of Projects", "funded_amount_sum":"Total sum of investments in country", 'funded_amount_mean': 'Mean size of investment in country', 'continent':'Continent'},
                     size = 'funded_amount_sum',
                     size_max= 40,
                     hover_name = 'country',
                     )
fig.show()

In terms of the business model, it is advantageous for Kiva if both the number of projects in a country and the average investment amount increase. A country moving in a positive direction on either axis is good for the business, and this is also reflected in the total amount lent in that country, which is shown by the size of each point in the chart.

Some countries on the left have reached a large volume of credit through the sheer number of projects. These include Kenya, Uganda, El Salvador, the Philippines and Cambodia, coloured light blue. In all of these countries at least 10 million USD has flowed and over 10,000 projects have been funded.

Another group of countries can be seen at the bottom towards the right. These nations attract larger investments per project but do not reach their volume through a mass of projects. They include the Democratic Republic of the Congo, Rwanda and Senegal, but also the USA, and are coloured orange. In all of these countries at least 10 million USD has flowed, with an average of over 1,000 USD per project.

It would be desirable to combine both dynamics, and some South American countries seem to be starting to do so. Ecuador, Paraguay and Peru sit in a quadrant of the chart where few other nations are represented. Projects there attract investment in large numbers and also in larger sums. From the other continents, only Vietnam joins them. These four countries are the only ones so far with more than 10,000 loans and an average of more than 1,000 USD per project.

All nations coloured grey fall short of the 10 million USD threshold in total funding.

In [None]:
#Check for all nations over 10 million USD
df_scatter_country.loc[((df_scatter_country.loc[:,'funded_amount_sum'] > 10000000))]

Unnamed: 0,country,country_code,funded_amount_count,funded_amount_sum,funded_amount_mean,continent,country_type
2,Armenia,AM,8628,11186225.0,1296.502686,Asia,"Above 10 Mio USD, High Mean"
7,Bolivia,BO,8732,18210724.0,2085.515869,South America,"Above 10 Mio USD, High Mean"
11,Cambodia,KH,34809,18784824.0,539.654236,Asia,"Above 10 Mio USD, High Volume"
15,Colombia,CO,21217,11968075.0,564.079529,South America,"Above 10 Mio USD, High Volume"
20,Ecuador,EC,13487,14568600.0,1080.195801,South America,"Above 10 Mio USD, High Volume and Mean"
22,El Salvador,SV,39320,23059976.0,586.46936,North America,"Above 10 Mio USD, High Volume"
26,Guatemala,GT,7234,10857175.0,1500.853638,North America,"Above 10 Mio USD, High Mean"
34,Kenya,KE,75112,31969964.0,425.630585,Africa,"Above 10 Mio USD, High Volume"
38,Lebanon,LB,8776,11539125.0,1314.85022,Asia,"Above 10 Mio USD, High Mean"
54,Pakistan,PK,26853,12465425.0,464.209778,Asia,"Above 10 Mio USD, High Volume"


In [None]:
#Check for > 10 million USD, > 1000 USD per project and > 10000 projects
df_scatter_country.loc[((df_scatter_country.loc[:,'funded_amount_sum'] > 10000000) & (df_scatter_country.loc[:,'funded_amount_mean'] > 1000) & (df_scatter_country.loc[:,'funded_amount_count'] > 10000))]

Unnamed: 0,country,country_code,funded_amount_count,funded_amount_sum,funded_amount_mean,continent,country_type
20,Ecuador,EC,13487,14568600.0,1080.195801,South America,"Above 10 Mio USD, High Volume and Mean"
57,Paraguay,PY,11841,29297650.0,2474.254639,South America,"Above 10 Mio USD, High Volume and Mean"
58,Peru,PE,22179,30302100.0,1366.251831,South America,"Above 10 Mio USD, High Volume and Mean"
82,Vietnam,VN,10841,13660925.0,1260.116699,Asia,"Above 10 Mio USD, High Volume and Mean"


### Bar charts for profile comparison among countries

In [None]:
# Transferring the country type to be investigated further into the main data set
df_kiva_reduced['country_type'] = 'Regular'
df_kiva_reduced.loc[df_kiva_reduced['country'].isin(['Vietnam', 'Ecuador', 'Peru', 'Paraguay']), 'country_type'] = 'Above 10 Mio USD, High Volume and Mean'
df_kiva_reduced

Unnamed: 0,funded_amount,loan_amount,activity,sector,country_code,country,currency,term_in_months,lender_count,borrower_genders,...,gender_of_borrowers,continent,percentage_of_credit_goal_funded,mean_investment_given_per_lender,mean_investment_received_per_borrower,mean_credit_goal_per_borrower,mean_total_credit_rates_per_month,mean_total_credit_rates_per_month_per_person,group_size,country_type
0,300.0,300.0,Fruits & Vegetables,Food,PK,Pakistan,PKR,12,12,female,...,female,Asia,100.0,25.00000,300.0,300.0,25.000000,25.000000,individual person,Regular
1,575.0,575.0,Rickshaw,Transportation,PK,Pakistan,PKR,11,14,"female, female",...,female,Asia,100.0,41.07143,287.5,287.5,52.272728,26.136364,small group (2-5),Regular
2,150.0,150.0,Transportation,Transportation,IN,India,INR,43,6,female,...,female,Asia,100.0,25.00000,150.0,150.0,3.488372,3.488372,individual person,Regular
3,200.0,200.0,Embroidery,Arts,PK,Pakistan,PKR,11,8,female,...,female,Asia,100.0,25.00000,200.0,200.0,18.181818,18.181818,individual person,Regular
4,400.0,400.0,Milk Sales,Food,PK,Pakistan,PKR,14,16,female,...,female,Asia,100.0,25.00000,400.0,400.0,28.571428,28.571428,individual person,Regular
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
671199,0.0,25.0,Livestock,Agriculture,PY,Paraguay,USD,13,0,female,...,female,South America,0.0,0.00000,0.0,25.0,0.000000,0.000000,individual person,"Above 10 Mio USD, High Volume and Mean"
671200,0.0,25.0,Livestock,Agriculture,PY,Paraguay,USD,13,0,female,...,female,South America,0.0,0.00000,0.0,25.0,0.000000,0.000000,individual person,"Above 10 Mio USD, High Volume and Mean"
671201,25.0,25.0,Livestock,Agriculture,KE,Kenya,KES,13,1,female,...,female,Africa,100.0,25.00000,25.0,25.0,1.923077,1.923077,individual person,Regular
671203,0.0,25.0,Livestock,Agriculture,KE,Kenya,KES,13,0,female,...,female,Africa,0.0,0.00000,0.0,25.0,0.000000,0.000000,individual person,Regular


When considering business expansion, it might be worth taking a closer look at these four nations.

In [None]:
#Creating group by
df_bar_country_type = df_kiva_reduced.groupby(['country_type', 'sector' ], as_index= False).agg(funded_amount_count = ('funded_amount', 'count'),
                                                                                                funded_amount_sum = ('funded_amount', 'sum'),
                                                                                                funded_amount_mean = ('funded_amount', 'mean'))


# Adding percentages for better comparison (float column, so the percentage values fit the dtype)
df_bar_country_type['percentage_of_projects_by_country_type'] = 0.0
df_bar_country_type.loc[((df_bar_country_type.loc[:,'country_type'] == 'Above 10 Mio USD, High Volume and Mean'),'percentage_of_projects_by_country_type')] = df_bar_country_type['funded_amount_count'] / 58348 *100
df_bar_country_type.loc[((df_bar_country_type.loc[:,'country_type'] == 'Regular'),'percentage_of_projects_by_country_type')] = df_bar_country_type['funded_amount_count'] / 608624 *100

df_bar_country_type['percentage_of_total_sums_by_country_type'] = 0.0
df_bar_country_type.loc[((df_bar_country_type.loc[:,'country_type'] == 'Above 10 Mio USD, High Volume and Mean'),'percentage_of_total_sums_by_country_type')] = df_bar_country_type['funded_amount_sum'] / 87829274 *100
df_bar_country_type.loc[((df_bar_country_type.loc[:,'country_type'] == 'Regular'),'percentage_of_total_sums_by_country_type')] = df_bar_country_type['funded_amount_sum'] / 435831667 *100
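The hard-coded denominators are the totals per country type; 58348, for instance, is the sum of the loan counts for Ecuador, Paraguay, Peru and Vietnam in the table further up. A minimal sketch of deriving these totals from the grouped data with `groupby(...).transform('sum')` instead of typing them in; the toy frame `df_demo` and its numbers are made up for illustration:

```python
import pandas as pd

# Toy stand-in for df_bar_country_type (made-up numbers, two sectors only)
df_demo = pd.DataFrame({
    "country_type": ["Regular", "Regular",
                     "Above 10 Mio USD, High Volume and Mean",
                     "Above 10 Mio USD, High Volume and Mean"],
    "sector": ["Education", "Retail", "Education", "Retail"],
    "funded_amount_count": [30, 70, 20, 60],
    "funded_amount_sum": [3_000.0, 7_000.0, 5_000.0, 15_000.0],
})

# Per-row totals of the row's own country type, aligned with the original index
totals = (
    df_demo
    .groupby("country_type")[["funded_amount_count", "funded_amount_sum"]]
    .transform("sum")
)

df_demo["percentage_of_projects_by_country_type"] = (
    df_demo["funded_amount_count"] / totals["funded_amount_count"] * 100
)
df_demo["percentage_of_total_sums_by_country_type"] = (
    df_demo["funded_amount_sum"] / totals["funded_amount_sum"] * 100
)
print(df_demo)
```

This way the percentages stay correct if the set of countries in either category changes.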


In [None]:
# Creating bar plot

fig = px.bar(df_bar_country_type, 
             x="sector", 
             y="percentage_of_projects_by_country_type",
             color='country_type',
             barmode='group', 
             title= '<b>Percentage of total projects by sector</b>',
             labels={'sector':'Sector', 'country_type':'Country Type', "percentage_of_projects_by_country_type" : 'Percentage of projects by sector'},
             hover_name= 'country_type')

fig.add_annotation(x=4, y=11,
                   text="6 percentage point gap in education",
                   showarrow=True,
                   arrowhead=1,
                   bordercolor='black',
                   bgcolor='white')

fig.add_annotation(x=8, y=7.5,
                   text="2.5 percentage point gap in housing",
                   showarrow=True,
                   arrowhead=1,
                   bordercolor='black',
                   bgcolor='white')

fig.add_annotation(x=11, y=19,
                   text="4 percentage point gap in retail",
                   showarrow=True,
                   arrowhead=1,
                   bordercolor='black',
                   bgcolor='white')


fig.show()

Looking at the chart, the shares of projects are distributed very similarly across the individual sectors. Major differences appear in only three places: our 'successful' countries have a clearly higher share of projects in education and housing, and a lower share of retail projects.

At this point it might be worth looking at whether more education and housing loans could lead to higher funding volumes in other countries.

In order to make more precise statements, it is not enough to just look at the number of projects and their distribution. The distribution of investments received is also interesting:

In [None]:
#Creating bar plot

fig = px.bar(df_bar_country_type, x="sector", y="percentage_of_total_sums_by_country_type",
             color='country_type', barmode='group', title= '<b>Percentage of total sum by sector</b>',
            labels={'sector':'Sector', 'country_type':'Country Type', "percentage_of_total_sums_by_country_type" : 'Percentage of total sum by sector'},
           hover_data= ['percentage_of_total_sums_by_country_type'])

fig.add_annotation(x=0, y=26,
                   text="5 percentage point gap in agriculture",
                   showarrow=True,
                   arrowhead=1,
                   bordercolor='black',
                   bgcolor='white')

fig.add_annotation(x=2, y=8.5,
                   text="2 percentage point gap in clothing",
                   showarrow=True,
                   arrowhead=1,
                   bordercolor='black',
                   bgcolor='white')

fig.add_annotation(x=6, y=26.5,
                   text="5 percentage point gap in food",
                   showarrow=True,
                   arrowhead=1,
                   bordercolor='black',
                   bgcolor='white')

fig.add_annotation(x=10, y=3.5,
                   text="2 percentage point gap in personal use",
                   showarrow=True,
                   arrowhead=1,
                   bordercolor='black',
                   bgcolor='white')

fig.show()

The investments received are also distributed broadly similarly across the individual sectors. For example, wholesale, arts and entertainment attracted little money in either country category. Larger differences (2 percentage points or more) can be found in four places. The 'regular' countries receive a much larger share of their investment in agriculture, and also more in personal use. Conversely, clothing and food projects account for a larger share of investment volume in the countries previously identified as best-practice markets on purely economic criteria.
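The gaps annotated in the charts were read off visually. A minimal sketch of computing them instead, by pivoting the grouped table so that each country type becomes a column and then taking the difference; the toy frame `df_demo` with its three sectors and percentages is made up for illustration, and only the 2 percentage point threshold comes from the text:

```python
import pandas as pd

SPECIAL = "Above 10 Mio USD, High Volume and Mean"

# Toy stand-in for df_bar_country_type (made-up shares, three sectors only)
df_demo = pd.DataFrame({
    "country_type": ["Regular"] * 3 + [SPECIAL] * 3,
    "sector": ["Agriculture", "Food", "Arts"] * 2,
    "percentage_of_total_sums_by_country_type": [30.0, 20.0, 2.0, 25.0, 25.0, 2.5],
})

# One row per sector, one column per country type
wide = df_demo.pivot(
    index="sector",
    columns="country_type",
    values="percentage_of_total_sums_by_country_type",
)

# Positive gap: the sector has a larger share in the four selected countries
gap = wide[SPECIAL] - wide["Regular"]
large_gaps = gap[gap.abs() >= 2].sort_values()
print(large_gaps)
```

The same pattern should work for `percentage_of_projects_by_country_type`, and the resulting values could feed the annotation texts directly.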

**Summary**

The analysis can be summarized in three points:

- The world map shows that there are still many potential expansion markets in which Kiva could become active in order to develop new regions.
- The scatter plot shows that historically a few strong markets have carried the organization's overall business. Clearly identifying and maintaining these markets could be a good strategy for building on what has already been developed.
- The many regions with little activity so far can learn from the cases that are going well. The grouped bar charts begin to identify how the larger markets differ from the less successful ones. Closing these gaps could also bring business success, because markets might then move in a positive direction on the scatter plot, in investment volume or in number of projects.