# Customer satisfaction at banks in Saratov, Russia

I am interested in this topic because I am going to open a bank account online, and later I may need on-site service in Saratov, Russia, where I am originally from.
I have not been in Saratov for almost 3 years and do not know how things stand there now.
So my task is to choose the bank in Saratov that treats its customers best.

In [231]:
import psycopg2
import pandas as pd

This reference explains how to run SQL queries from Python:


- `Read sql query from DB to pandas` - https://pandas.pydata.org/docs/reference/api/pandas.read_sql_query.html

In [232]:
username = 'olga'
password = 'olga_google'
host = '77.244.65.15'

Access to the database was provided by Arthur Semyonov, who parses data from the websites of different Russian banks.

In [233]:
from IPython.display import display, HTML
display(HTML("<style>.container { width:100% !important; }</style>"))

In [234]:
pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)
pd.set_option('display.width', 1000)

In [235]:
conn = psycopg2.connect(
        dbname='dwh',
        user=username,
        password=password,
        host=host,
        port=5432)
cursor = conn.cursor()

To begin, let us take a look at the database provided:

In [236]:
query = '''
    select 
        id
        , link 
        , title
        , city 
        , bank_name
        , score 
        , status 
        , username 
        , create_dt 
        , comments
    from home.dt_banki_responses
    order by id desc 
    limit 10
'''
cursor.execute(query)
result = cursor.fetchall()

This helper function runs a PostgreSQL query from Python and returns the first rows of the result:

In [237]:
def custom_read_from_sql(query, n):
    """Run a SQL query on the open connection and return the first n rows."""
    df = pd.read_sql_query(query, conn)
    return df.head(n)

Let us restrict ourselves to data from the current year (2021):

In [238]:
q = """
    select 
        *
    from 
        home.dt_banki_responses 
    where 
        date(create_dt) >= '2021-01-01'
"""
custom_read_from_sql(q, 5)

Unnamed: 0,id,link,title,city,bank_name,score,status,username,create_dt,comments,content,bank_answer,bank_answer_date,admin_answer,admin_answer_date,parse_dt
0,10562971,https://www.banki.ru/services/responses/bank/r...,Обман и нечестная реклама,г. Москва,Альфа-Банк,1.0,Без статуса,Nikita1995N,2021-09-12 17:57:00+00:00,0,Месяц назад я оформил карту по акции letique к...,XNA,NaT,Спасибо за отзыв. Предлагаем подождать ответ п...,2021-09-22 11:17:00+00:00,2021-10-09
1,10567228,https://www.banki.ru/services/responses/bank/r...,Звонят беспрестанно по чужому кредиту,г. Набережные Челны (Республика Татарстан),Райффайзенбанк,1.0,Не засчитана,JUli22,2021-09-24 16:32:00+00:00,0,Райффайзенбанку я не давала соглашение на обра...,Здравствуйте.Ваш номер телефона был указан у к...,2021-09-28 12:36:00+00:00,"Спасибо за отзыв.Уточните, пожалуйста, предста...",2021-10-01 15:21:00+00:00,2021-10-09
2,10539961,https://www.banki.ru/services/responses/bank/r...,Райфайзен банк положил больше 60000 зачислил 50р!,г. Краснодар (Краснодарский край),Райффайзенбанк,1.0,Без статуса,23rus1111,2021-08-06 11:59:00+00:00,2,На мачуги 2 в банкомате зачислял деньги на кар...,Добрый день.При регистрации претензии сотрудн...,2021-08-09 11:42:00+00:00,"Если средства были возвращены, вы можете отмет...",2021-10-09 16:00:00+00:00,2021-10-09
3,10573895,https://www.banki.ru/services/responses/bank/r...,Вопрос почему новая дата у финуполномоченого?,г. Орехово-Зуево,ВТБ,,Зачтено,Dmitry120,2021-10-14 22:30:00+00:00,5,"Доброго времени суток, у меня вопрос, почему о...",Добрый деньДля проверки информации по отзыву п...,2021-10-15 14:27:00+00:00,XNA,NaT,2021-10-19
4,10538946,https://www.banki.ru/services/responses/bank/r...,Отвратительное отношение!,г. Нижневартовск (Ханты-Мансийский автономный ...,Райффайзенбанк,1.0,Без статуса,L*******@yandex.ru,2021-08-04 09:03:00+00:00,1,"Подал заявку на дебетовую карту, курьер не поз...",Здравствуйте.Мы сожалеем о случившейся ситуаци...,2021-08-24 11:03:00+00:00,XNA,NaT,2021-10-09
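
The 2021 cutoff in the query above is written directly into the SQL text. As a hedged sketch, the same filter can be passed as a query parameter through the `params` argument of `pandas.read_sql_query`. The example runs against an in-memory SQLite table that stands in for `home.dt_banki_responses` (an assumption, so that it runs without database access); the helper name `read_since` and the sample rows are hypothetical. SQLite uses `?` as the placeholder, whereas psycopg2 uses `%s`.

```python
import sqlite3

import pandas as pd


def read_since(conn, cutoff, n):
    """Return the first n reviews created on or after the cutoff date."""
    query = """
        select id, bank_name, score, create_dt
        from dt_banki_responses
        where date(create_dt) >= ?
        order by id
    """
    # The cutoff is sent separately from the SQL text, not pasted into it.
    return pd.read_sql_query(query, conn, params=(cutoff,)).head(n)


# Hypothetical stand-in table with three made-up reviews.
conn_demo = sqlite3.connect(":memory:")
conn_demo.execute(
    "create table dt_banki_responses "
    "(id integer, bank_name text, score real, create_dt text)"
)
conn_demo.executemany(
    "insert into dt_banki_responses values (?, ?, ?, ?)",
    [
        (1, "Bank A", 5.0, "2020-12-31 23:00:00"),
        (2, "Bank A", 1.0, "2021-01-01 09:30:00"),
        (3, "Bank B", None, "2021-06-15 12:00:00"),
    ],
)

recent = read_since(conn_demo, "2021-01-01", 5)
print(recent)
```

Only the two reviews created in 2021 are returned; the review from 31 December 2020 is filtered out.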


Customers can leave a review after using a bank's service and rate it from 1 to 5. Let us work out which bank does better according to these reviews.
For now, I will not restrict myself to banks and customers in Saratov, because the quality of service depends more on how business operations are organized overall
than on the work of a particular local branch.
So, let us first look at how often banks leave their customers without any answer.

In [239]:
q = """
 with re as (
   select 
    bank_name as nameb, 
    cast(count(create_dt) as decimal) as f_date,
    cast(count(bank_answer_date) as decimal) as resp_date,
    cast(sum(case when bank_answer_date notnull   
            then  0
            else  1
    end) as decimal) as no_resp
 from home.dt_banki_responses
where date(create_dt) >= '2021-01-01'
group by bank_name
     )
select 
 nameb, 
 f_date,
 no_resp,
 resp_date, 
 round(re.no_resp/re.f_date*100,1) || ' %' as perc_no_ans
from re
order by perc_no_ans desc;
"""
custom_read_from_sql(q, 15)

Unnamed: 0,nameb,f_date,no_resp,resp_date,perc_no_ans
0,Райффайзенбанк,1162.0,93.0,1069.0,8.0 %
1,Альфа-Банк,7128.0,466.0,6662.0,6.5 %
2,Хоум Кредит Банк,5548.0,256.0,5292.0,4.6 %
3,ВТБ,10375.0,362.0,10013.0,3.5 %
4,СберБанк,11404.0,356.0,11048.0,3.1 %
5,Газпромбанк,3010.0,85.0,2925.0,2.8 %
6,Тинькофф Банк,26648.0,669.0,25979.0,2.5 %
7,Почта Банк,2440.0,57.0,2383.0,2.3 %
8,Банк Открытие,3076.0,67.0,3009.0,2.2 %
9,МТС Банк,1225.0,6.0,1219.0,0.5 %


Raiffeisenbank, Alfa-Bank, and Home Credit Bank have the highest share of customer reviews left without any answer.
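
One caveat about the query above: `perc_no_ans` is built as text (for example `8.0 %`), so `order by perc_no_ans desc` compares strings, and a value such as `10.0 %` would sort below `5.0 %`. Below is a minimal pandas sketch of the same calculation that keeps the percentage numeric. The function name `unanswered_share` and the sample rows are hypothetical.

```python
import pandas as pd


def unanswered_share(reviews):
    """Per bank: number of reviews, unanswered reviews, and their share in %."""
    out = (
        reviews.assign(unanswered=reviews["bank_answer_date"].isna())
        .groupby("bank_name")
        .agg(reviews=("unanswered", "size"), no_resp=("unanswered", "sum"))
    )
    # Keep the share as a number so that sorting is numeric, not textual.
    out["perc_no_ans"] = (out["no_resp"] / out["reviews"] * 100).round(1)
    return out.sort_values("perc_no_ans", ascending=False)


# Made-up data: Bank A has 1 of 10 reviews unanswered, Bank B has 1 of 20.
sample = pd.DataFrame({
    "bank_name": ["Bank A"] * 10 + ["Bank B"] * 20,
    "bank_answer_date": pd.to_datetime(
        ["2021-03-01"] * 9 + [None] + ["2021-03-01"] * 19 + [None]
    ),
})

result = unanswered_share(sample)
print(result)

# The same two values sorted as text end up in the wrong order.
as_text = sorted(["10.0 %", "5.0 %"], reverse=True)
print(as_text)
```

Sorted numerically, Bank A (10.0) comes before Bank B (5.0); sorted as text, `5.0 %` comes first.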

Let us now calculate a far more important indicator: the scores that customers give to their banks:

In [240]:
q = """
with db as
(select id, bank_name, city, score,create_dt,bank_answer_date 
from home.dt_banki_responses
where date(create_dt) >= '2021-01-01'
order by bank_name,city), ref_number as 
  ( select bank_name, count(id) as customer, count(score) as num_ref,
count(id) - count(score) as no_resp,
round((count(id) - count(score))/cast(count(id) as decimal)*100,1) ||'  %' as Percent_no_ref
from db
group by bank_name
order by percent_no_ref)
select bank_name,  
       count(id) as num_transaction, 
       count(score) as Num_score, 
       sum(case when score=5 then 1 else 0 end) as num_of_5,
       sum(case when score=4 then 1 else 0 end) as num_of_4,
       sum(case when score=3 then 1 else 0 end) as num_of_3,
       sum(case when score=2 then 1 else 0 end) as num_of_2,
       sum(case when score=1 then 1 else 0 end) as num_of_1,
       sum(case when score is null then 1 else 0 end) as no_score
from db 
group by bank_name
order by bank_name;
"""
custom_read_from_sql(q, 15)

Unnamed: 0,bank_name,num_transaction,num_score,num_of_5,num_of_4,num_of_3,num_of_2,num_of_1,no_score
0,Альфа-Банк,7128,5959,342,45,164,594,4814,1169
1,Банк Открытие,3076,2681,600,61,78,274,1668,395
2,ВТБ,10375,8968,2726,132,199,815,5096,1407
3,Газпромбанк,3010,2728,496,68,95,281,1788,282
4,МТС Банк,1225,1104,577,32,10,57,428,121
5,Почта Банк,2440,2064,224,22,37,165,1616,376
6,Райффайзенбанк,1162,967,120,15,29,101,702,195
7,СберБанк,11404,8726,1357,68,208,770,6323,2678
8,Тинькофф Банк,26648,25474,19278,833,298,830,4235,1174
9,Точка,724,702,629,32,7,7,27,22


The raw numbers are not very informative, so let us convert the counts of each score into percentages:

In [241]:
q = '''
with saratov as
(select id, bank_name, city, score,create_dt,bank_answer_date 
from home.dth_banki_responses
-- from home.dt_banki_responses
-- where city like '%Сарат%'
where date(create_dt) >= '2021-01-01'
order by bank_name,city), ref_number as 
  ( select bank_name, count(id) as customer, count(score) as num_ref,
count(id) - count(score) as no_resp,
round((count(id) - count(score))/cast(count(id) as decimal)*100,1) ||'  %' as Percent_no_ref
from saratov
group by bank_name
order by percent_no_ref), distr_scores as (
select bank_name,  
       count(id) as num_transaction, 
       count(score) as Num_score, 
       sum(case when score=5 then 1 else 0 end) as num_of_5,
       sum(case when score=4 then 1 else 0 end) as num_of_4,
       sum(case when score=3 then 1 else 0 end) as num_of_3,
       sum(case when score=2 then 1 else 0 end) as num_of_2,
       sum(case when score=1 then 1 else 0 end) as num_of_1,
       sum(case when score is null then 1 else 0 end) as no_score
from saratov
group by bank_name)
select bank_name,
       num_transaction,
       num_score,
       round(num_of_5/cast(num_transaction as decimal)*100,1) || ' %' as percent_of_5,
       round(num_of_4/cast(num_transaction as decimal)*100,1) || ' %' as percent_of_4,
       round(num_of_3/cast(num_transaction as decimal)*100,1) || ' %' as percent_of_3,
       round(num_of_2/cast(num_transaction as decimal)*100,1) || ' %' as percent_of_2,
       round(num_of_1/cast(num_transaction as decimal)*100,1) || ' %' as percent_of_1,
       round(no_score/cast(num_transaction as decimal)*100,1) || ' %' as percent_no_score,
       num_of_5/cast(num_transaction as decimal)*100 + 
       num_of_4/cast(num_transaction as decimal)*100 + 
       num_of_3/cast(num_transaction as decimal)*100 + 
       num_of_2/cast(num_transaction as decimal)*100 + 
       num_of_1/cast(num_transaction as decimal)*100 + 
       no_score/cast(num_transaction as decimal)*100  as total_perc
from distr_scores
order by num_of_5/cast(num_transaction as decimal)
 desc;
'''
custom_read_from_sql(q, 15)


Unnamed: 0,bank_name,num_transaction,num_score,percent_of_5,percent_of_4,percent_of_3,percent_of_2,percent_of_1,percent_no_score,total_perc
0,Точка,789,767,87.1 %,4.6 %,0.9 %,0.9 %,3.8 %,2.8 %,100.0
1,Тинькофф Банк,30650,29367,69.7 %,2.9 %,1.3 %,3.4 %,18.5 %,4.2 %,100.0
2,Хоум Кредит Банк,6786,6448,50.2 %,10.1 %,2.4 %,4.9 %,27.4 %,5.0 %,100.0
3,МТС Банк,1225,1104,47.1 %,2.6 %,0.8 %,4.7 %,34.9 %,9.9 %,100.0
4,ВТБ,14125,12507,22.4 %,1.1 %,2.0 %,8.7 %,54.3 %,11.5 %,100.0
5,Банк Открытие,4010,3576,16.9 %,1.7 %,2.6 %,9.7 %,58.2 %,10.8 %,100.0
6,Газпромбанк,4163,3830,14.4 %,1.8 %,4.6 %,10.4 %,60.9 %,8.0 %,100.0
7,СберБанк,11404,8726,11.9 %,0.6 %,1.8 %,6.8 %,55.4 %,23.5 %,100.0
8,Райффайзенбанк,1459,1255,9.2 %,1.2 %,2.3 %,9.0 %,64.3 %,14.0 %,100.0
9,Почта Банк,2683,2294,8.9 %,0.9 %,1.5 %,6.9 %,67.4 %,14.5 %,100.0


Tinkoff Bank, Home Credit Bank, and MTS Bank look like the best ones in terms of customer satisfaction.
Tochka shows the best ratio of top scores to the number of reviews, but its number of reviews (`num_transaction`) is low, so we do not take this bank into consideration.
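
The conversion from counts to percentages can also be done on the pandas side. Below is a minimal sketch using `pandas.crosstab` with `normalize="index"`, assuming a DataFrame with the `bank_name` and `score` columns of the source table. The function name `score_distribution` and the sample rows are hypothetical.

```python
import pandas as pd


def score_distribution(reviews):
    """Share of each score (and of missing scores) per bank, in percent."""
    labels = reviews["score"].map(
        lambda s: "no_score" if pd.isna(s) else f"percent_of_{int(s)}"
    )
    # normalize="index" divides every row by its total, like the SQL above.
    dist = pd.crosstab(reviews["bank_name"], labels, normalize="index") * 100
    return dist.round(1)


# Made-up reviews; None marks a review without a score.
sample = pd.DataFrame({
    "bank_name": ["Bank A"] * 5 + ["Bank B"] * 4,
    "score": [5, 5, 5, 1, None, 1, 1, 4, 5],
})

dist = score_distribution(sample)
print(dist)
```

Each row sums to 100 up to rounding, which plays the same role as the `total_perc` check column in the query.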

Now that we know which banks have the best customer reviews nationwide, let us see what is going on in Saratov:

In [242]:
q = '''
with saratov as
(select id, bank_name, city, score,create_dt,bank_answer_date 
from home.dth_banki_responses
-- from home.dt_banki_responses
where city like '%Сарат%'
order by bank_name,city), ref_number as 
  ( select bank_name, count(id) as customer, count(score) as num_ref,
count(id) - count(score) as no_resp,
round((count(id) - count(score))/cast(count(id) as decimal)*100,1) ||'  %' as Percent_no_ref
from saratov
group by bank_name
order by percent_no_ref), distr_scores as (
select bank_name,  
       count(id) as num_transaction, 
       count(score) as Num_score, 
       sum(case when score=5 then 1 else 0 end) as num_of_5,
       sum(case when score=4 then 1 else 0 end) as num_of_4,
       sum(case when score=3 then 1 else 0 end) as num_of_3,
       sum(case when score=2 then 1 else 0 end) as num_of_2,
       sum(case when score=1 then 1 else 0 end) as num_of_1,
       sum(case when score is null then 1 else 0 end) as no_score
from saratov
group by bank_name)
select bank_name,
       num_transaction,
       num_score,
       round(num_of_5/cast(num_transaction as decimal)*100,1) || ' %' as percent_of_5,
       round(num_of_4/cast(num_transaction as decimal)*100,1) || ' %' as percent_of_4,
       round(num_of_3/cast(num_transaction as decimal)*100,1) || ' %' as percent_of_3,
       round(num_of_2/cast(num_transaction as decimal)*100,1) || ' %' as percent_of_2,
       round(num_of_1/cast(num_transaction as decimal)*100,1) || ' %' as percent_of_1,
       round(no_score/cast(num_transaction as decimal)*100,1) || ' %' as percent_no_score,
       num_of_5/cast(num_transaction as decimal)*100 + 
       num_of_4/cast(num_transaction as decimal)*100 + 
       num_of_3/cast(num_transaction as decimal)*100 + 
       num_of_2/cast(num_transaction as decimal)*100 + 
       num_of_1/cast(num_transaction as decimal)*100 + 
       no_score/cast(num_transaction as decimal)*100  as total_perc
from distr_scores
order by num_of_5/cast(num_transaction as decimal)
 desc;
'''
custom_read_from_sql(q, 15)


Unnamed: 0,bank_name,num_transaction,num_score,percent_of_5,percent_of_4,percent_of_3,percent_of_2,percent_of_1,percent_no_score,total_perc
0,МТС Банк,18,18,66.7 %,5.6 %,0.0 %,5.6 %,22.2 %,0.0 %,100.0
1,Тинькофф Банк,393,370,65.6 %,3.1 %,1.8 %,5.1 %,18.6 %,5.9 %,100.0
2,Точка,11,10,63.6 %,18.2 %,0.0 %,9.1 %,0.0 %,9.1 %,100.0
3,Хоум Кредит Банк,359,264,29.8 %,2.8 %,3.6 %,6.1 %,31.2 %,26.5 %,100.0
4,Банк Открытие,326,248,18.7 %,1.8 %,3.4 %,8.9 %,43.3 %,23.9 %,100.0
5,СберБанк,526,386,13.1 %,0.8 %,1.3 %,6.7 %,51.5 %,26.6 %,100.0
6,Альфа-Банк,374,268,12.6 %,0.8 %,1.9 %,8.6 %,47.9 %,28.3 %,100.0
7,Газпромбанк,129,112,9.3 %,3.1 %,0.8 %,14.7 %,58.9 %,13.2 %,100.0
8,ВТБ,313,258,8.0 %,0.3 %,3.5 %,9.9 %,60.7 %,17.6 %,100.0
9,Райффайзенбанк,54,38,3.7 %,0.0 %,1.9 %,16.7 %,48.1 %,29.6 %,100.0


The leaders again are MTS Bank and Tinkoff Bank. Tochka has only a small number of reviews, so we will not take it into account.
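
The choice to set Tochka aside for having few reviews can be made explicit with a minimum-review cutoff. Below is a minimal sketch with figures copied from the Saratov table above. The function name `with_enough_reviews` and the cutoffs of 15 and 30 are arbitrary assumptions, not values from the analysis.

```python
import pandas as pd

# First four rows of the Saratov table above.
saratov = pd.DataFrame({
    "bank_name": ["МТС Банк", "Тинькофф Банк", "Точка", "Хоум Кредит Банк"],
    "num_transaction": [18, 393, 11, 359],
    "percent_of_5": [66.7, 65.6, 63.6, 29.8],
})


def with_enough_reviews(table, min_reviews):
    """Keep banks with at least min_reviews reviews, best share of 5s first."""
    kept = table[table["num_transaction"] >= min_reviews]
    return kept.sort_values("percent_of_5", ascending=False).reset_index(drop=True)


lenient = with_enough_reviews(saratov, 15)  # assumed cutoff
strict = with_enough_reviews(saratov, 30)   # assumed cutoff
print(lenient)
print(strict)
```

With a cutoff of 15 only Tochka (11 reviews) drops out. With a cutoff of 30, MTS Bank (18 reviews) drops out as well, so its first place in Saratov rests on a small sample.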

The data includes the date of each customer's inquiry and the date the bank answered it. Let us calculate the average time banks take to answer their customers, for the inquiries that were answered at all.

In [243]:
q = ''' 
with saratov as (select id, bank_name, city, score,create_dt,bank_answer_date 
from home.dt_banki_responses
where city like '%Сара%' and
date(create_dt) >= '2021-01-01'
order by bank_name,city), time_to_answer as ( 
select bank_name,
       create_dt ,
       bank_answer_date,
       bank_answer_date-create_dt  as time_for_response,
       cast( date_part('day',bank_answer_date-create_dt) *24 *60 +
		date_part('hour',bank_answer_date-create_dt) * 60 +
		date_part('minute',bank_answer_date-create_dt) as integer) as minu
from saratov
where bank_answer_date notnull 
order by bank_name)
select bank_name, sum(time_for_response),
       count(create_dt) number_of_requests,
       sum(time_for_response)/count(create_dt) as aver_response_time
from time_to_answer
group by bank_name
order by aver_response_time;
'''
custom_read_from_sql(q, 15)

Unnamed: 0,bank_name,sum,number_of_requests,aver_response_time
0,СберБанк,26 days 21:10:00,142,0 days 04:32:36.338028
1,МТС Банк,8 days 23:42:00,21,0 days 10:16:17.142857
2,Газпромбанк,16 days 21:42:00,37,0 days 10:57:53.513513
3,Банк Открытие,44 days 01:56:00,33,1 days 08:03:30.909091
4,ВТБ,124 days 07:09:00,74,1 days 16:18:46.216217
5,Хоум Кредит Банк,464 days 16:51:00,98,4 days 17:48:16.530613
6,Альфа-Банк,367 days 13:34:00,61,6 days 00:36:57.049181
7,Тинькофф Банк,2986 days 23:36:00,355,8 days 09:56:12.845071
8,Почта Банк,375 days 16:19:00,37,10 days 03:41:03.243243
9,Райффайзенбанк,119 days 12:01:00,11,10 days 20:43:43.636363


The clear leader here is Sberbank, the oldest and biggest bank in Russia, but MTS Bank, which has already proven to be one of the best in terms of customer satisfaction, is in second place.
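
The same average can be sketched in pandas by subtracting the two timestamp columns. This is a minimal illustration, assuming `create_dt` and `bank_answer_date` are datetime columns as in the source table. The function name `mean_response_hours` and the sample rows are hypothetical, and the result is expressed in hours rather than as an interval.

```python
import pandas as pd


def mean_response_hours(reviews):
    """Average time from review to bank answer per bank, in hours."""
    # Reviews that the bank never answered are left out, as in the SQL.
    answered = reviews.dropna(subset=["bank_answer_date"])
    delay = answered["bank_answer_date"] - answered["create_dt"]
    hours = delay.dt.total_seconds() / 3600
    return hours.groupby(answered["bank_name"]).mean().sort_values()


# Made-up reviews; the last one has no bank answer.
sample = pd.DataFrame({
    "bank_name": ["Bank A", "Bank A", "Bank B", "Bank B"],
    "create_dt": pd.to_datetime([
        "2021-08-04 09:00", "2021-08-05 09:00",
        "2021-08-06 12:00", "2021-08-07 12:00",
    ]),
    "bank_answer_date": pd.to_datetime([
        "2021-08-04 11:00", "2021-08-05 13:00",
        "2021-08-08 12:00", None,
    ]),
})

avg = mean_response_hours(sample)
print(avg)
```

Bank A answers after 2 and 4 hours, so its average is 3 hours; Bank B has one answered review with a 48-hour delay.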

Let us count the statuses shown on the website for each inquiry: "Problem solved" (Проблема решена), "Done" (Зачтено), "Not done" (Не засчитана), "Without status" (Без статуса), and "Under review" (Проверяется):

In [244]:
q = """
    select 
    distinct *, round(cast(status_cnt as numeric) / full_cnt * 100, 1) || ' %' as prcnt
    from (
        select 
            bank_name 
            , status 
            , count(*) over(partition by bank_name, status) as status_cnt
            , count(*) over(partition by bank_name) as full_cnt
        from 
            home.dt_banki_responses
            where city like '%Сара%' and
            date(create_dt) >= '2021-01-01'
        order by bank_name
    ) as t1
    order by bank_name
"""
custom_read_from_sql(q, 20)

Unnamed: 0,bank_name,status,status_cnt,full_cnt,prcnt
0,Альфа-Банк,Без статуса,19,66,28.8 %
1,Альфа-Банк,Зачтено,10,66,15.2 %
2,Альфа-Банк,Не засчитана,25,66,37.9 %
3,Альфа-Банк,Проблема решена,1,66,1.5 %
4,Альфа-Банк,Проверяется,11,66,16.7 %
5,Банк Открытие,Без статуса,13,34,38.2 %
6,Банк Открытие,Зачтено,3,34,8.8 %
7,Банк Открытие,Не засчитана,12,34,35.3 %
8,Банк Открытие,Проблема решена,4,34,11.8 %
9,Банк Открытие,Проверяется,2,34,5.9 %


I do not know the difference between "Problem solved" and "Done"; both seem to be positive outcomes. Still, this query might be useful when we want to check a particular bank.
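
For comparison, the window-function query above can be sketched in pandas with a `groupby` and a per-bank total. The sample is rebuilt from the Alfa-Bank counts in the output above; the function name `status_share` is hypothetical.

```python
import pandas as pd


def status_share(reviews):
    """Count of each status per bank and its share of the bank's reviews."""
    counts = (
        reviews.groupby(["bank_name", "status"])
        .size()
        .rename("status_cnt")
        .reset_index()
    )
    # Total number of reviews per bank, repeated on every status row.
    counts["full_cnt"] = counts.groupby("bank_name")["status_cnt"].transform("sum")
    counts["prcnt"] = (counts["status_cnt"] / counts["full_cnt"] * 100).round(1)
    return counts


# One row per review, rebuilt from the Alfa-Bank counts shown above.
alfa_counts = {
    "Без статуса": 19,
    "Зачтено": 10,
    "Не засчитана": 25,
    "Проблема решена": 1,
    "Проверяется": 11,
}
rows = [("Альфа-Банк", status) for status, n in alfa_counts.items() for _ in range(n)]
sample = pd.DataFrame(rows, columns=["bank_name", "status"])

shares = status_share(sample)
print(shares)
```

The `prcnt` column reproduces the Alfa-Bank percentages from the query output.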

# Summary

This investigation helped me choose a bank in Saratov even though I have been living in the USA for several years.
I already knew that Sberbank, Raiffeisenbank, Alfa-Bank, and Tinkoff Bank had a reputation as good banks, but this study helped me understand how they look from the customer satisfaction perspective.
Although the national leader Sberbank answers its clients faster than any other bank, clients are much more satisfied with Tinkoff Bank (69% positive feedback against 14% for Sberbank).
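
The 69% and 14% figures are consistent with counting scores of 4 and 5 as positive in the Saratov table. This reading is an assumption; the nationwide table gives slightly different shares. A short check of the arithmetic:

```python
# (percent_of_5, percent_of_4) per bank, copied from the Saratov table above.
saratov_scores = {
    "Тинькофф Банк": (65.6, 3.1),
    "СберБанк": (13.1, 0.8),
}

# Assumed definition of "positive": a score of 4 or 5.
positive = {bank: round(p5 + p4, 1) for bank, (p5, p4) in saratov_scores.items()}
print(positive)
```

This gives 68.7% for Tinkoff Bank and 13.9% for Sberbank, which round to the figures quoted.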

After this study I checked the Tinkoff website: they offer really good products in terms of interest rates and fees for transferring funds. So it looks like a good choice.



# References

1. How to run SQL queries from Python: https://pandas.pydata.org/docs/reference/api/pandas.read_sql_query.html
2. Access to the database was provided by Arthur Semyonov, who parses data from banki.ru: https://artydev.ru