# <p style="background-color:lightgray; font-family:verdana; font-size:250%; text-align:center; border-radius: 15px 20px;">🟠Measurement Problems🟠</p>

<div style="border-radius:10px; border:#D0C2F0 solid; padding: 15px; background-color: #FFF0F4; font-size:100%; text-align:left">

<h3 align="left"><font color='#5E5273'>🔍 Why Are Measurement Problems Important?</font></h3>

### A user's decision to purchase a product is shaped by social proof, known as "the wisdom of crowds."

> Let's say we are about to buy a product and are stuck between two options. The first product has a 5-star rating, while the second has a 4-star rating. All else being equal, we would prefer the product rated 5 stars.

> Let's change the scenario a bit. Only seven people gave the first product its 5-star rating, while 256 people rated the second product 4 stars. Now we would switch our choice from the 5-star product to the 4-star product.

> This illustrates how the strength of social proof can outweigh the headline score: we tend to prefer the products with more votes and comments, even though we know a larger crowd does not guarantee a better product. Getting the ranking right matters both for users, who want to reach the best product in terms of price and performance, and for sellers, who want their products to reach the right users.

# <p style="border-radius:10px; border:#DEB887 solid; padding:25px; background-color: #FFFAF0; font-size:100%;color:#52017A;text-align:center;"> Rating Product  </p>

In [1]:
#importing libraries that we are going to use
import pandas as pd
import math
import scipy.stats as st
from sklearn.preprocessing import  MinMaxScaler

In [2]:
#our data set is an online education platform's courses and their reviews
df = pd.read_csv("/kaggle/input/private-dataset/course_reviews.csv")
df.head().T

Unnamed: 0,0,1,2,3,4
Rating,5.0,5.0,4.5,5.0,4.0
Timestamp,2021-02-05 07:45:55,2021-02-04 21:05:32,2021-02-04 20:34:03,2021-02-04 16:56:28,2021-02-04 15:00:24
Enrolled,2021-01-25 15:12:08,2021-02-04 20:43:40,2019-07-04 23:23:27,2021-02-04 14:41:29,2020-10-13 03:10:07
Progress,5.0,1.0,1.0,10.0,10.0
Questions Asked,0.0,0.0,0.0,0.0,0.0
Questions Answered,0.0,0.0,0.0,0.0,0.0


In [3]:
#Rating distribution / how many of each rating point
df["Rating"].value_counts()

Rating
5.0    3267
4.5     475
4.0     383
3.5      96
3.0      62
1.0      15
2.0      12
2.5      11
1.5       2
Name: count, dtype: int64

In [4]:
#Distribution of questions asked: how many reviewers asked each number of questions
df["Questions Asked"].value_counts()

Questions Asked
0.0     3867
1.0      276
2.0       80
3.0       43
4.0       15
5.0       13
6.0        9
8.0        5
9.0        3
14.0       2
11.0       2
7.0        2
10.0       2
15.0       2
22.0       1
12.0       1
Name: count, dtype: int64

In [5]:
#Number of reviewers and average rating for each number of questions asked

df.groupby("Questions Asked").agg({"Questions Asked": "count",
                                   "Rating":"mean"})

Questions Asked,Questions Asked (count),Rating (mean)
0.0,3867,4.765193
1.0,276,4.740942
2.0,80,4.80625
3.0,43,4.744186
4.0,15,4.833333
5.0,13,4.653846
6.0,9,5.0
7.0,2,4.75
8.0,5,4.9
9.0,3,5.0


In [6]:
#The average score
df["Rating"].mean()

4.764284061993986

> Taking the plain average like this can hide the recent satisfaction trend for a product. For instance, if a product received very high scores early on and lower scores more recently, the early scores still dominate the mean and the recent downward trend goes unnoticed.
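
To make the point concrete, here is a minimal sketch with made-up ratings (not taken from the dataset), listed oldest first. It only illustrates how an overall mean can stay high while the most recent reviews drop; the window of two reviews is an arbitrary assumption.

```python
from statistics import mean

# Hypothetical ratings in chronological order: strong start, weaker recent reviews.
ratings_oldest_first = [5, 5, 5, 5, 5, 5, 5, 5, 3, 3]

# The plain mean treats every review the same, regardless of age.
overall_mean = mean(ratings_oldest_first)

# Looking only at the two most recent reviews tells a different story.
recent_mean = mean(ratings_oldest_first[-2:])

print(f"overall mean: {overall_mean:.2f}")
print(f"recent mean:  {recent_mean:.2f}")
```

The overall mean stays well above 4 even though the latest reviews are 3s, which is the gap a time-based weighting is meant to close.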

<div style="border-radius:10px; border:#D0C2F0 solid; padding: 15px; background-color: #FFF0F4; font-size:100%; text-align:left">

<h3 align="center"><font color='#5E5273'> ⌛ Time-Based Weighted Average Rating ⌛ </font></h3>

In [7]:
#Convert the time variable to a datetime dtype:
df["Timestamp"] = pd.to_datetime(df["Timestamp"])
#Set a fixed reference date to stand in for "today", chosen to fall just after the latest timestamps shown above:
current_date = pd.to_datetime('2021-02-10 0:0:0')
#Subtract each review's timestamp from the reference date and store the difference,
#in whole days, in a new "days" column:
df["days"] = (current_date - df["Timestamp"]).dt.days
#Count the reviews made within the last 30 days:
df[df["days"] <= 30].count()

Rating                194
Timestamp             194
Enrolled              194
Progress              194
Questions Asked       194
Questions Answered    194
days                  194
dtype: int64

In [8]:
#Average rating of the reviews made within the last 30 days:
df.loc[df["days"] <= 30,"Rating"].mean()

4.775773195876289

In [9]:
#Let's take the average of those greater than 30 days and less than or equal to 90:
df.loc[(df["days"] > 30) & (df["days"] <= 90), "Rating"].mean()

4.763833992094861

In [10]:
#For those greater than 90 and less than or equal to 180:
df.loc[(df["days"] > 90) & (df["days"] <= 180), "Rating"].mean()

4.752503576537912

In [11]:
#All ratings beyond 180 days:
df.loc[df["days"] > 180,"Rating"].mean()

4.76641586867305

> Within 0-30 days the average rating is 4.78, within 30-90 days it is 4.76, and within 90-180 days it is 4.75 (reviews older than 180 days average 4.77). Across the last 180 days, then, course satisfaction has risen slightly as the reviews get more recent.

In [12]:
# Weight the periods as follows: 28% for the last 30 days, 26% for 30-90 days, 24% for 90-180 days,
# and the remaining 22% for everything older:
df.loc[df["days"] <= 30,"Rating"].mean() * 28/100 + \
    df.loc[(df["days"] > 30) & (df["days"] <= 90), "Rating"].mean() * 26/100 + \
    df.loc[(df["days"] > 90) & (df["days"] <= 180), "Rating"].mean() * 24/100 + \
    df.loc[df["days"] > 180,"Rating"].mean() * 22/100
# The time-weighted average across the four intervals comes out at about 4.765.

4.765025682267194

In [13]:
def time_based_weighted_average(dataframe, w1=28, w2=26, w3=24,w4=22):
    return dataframe.loc[dataframe["days"] <= 30,"Rating"].mean() * w1 / 100 + \
           dataframe.loc[(dataframe["days"] > 30) & (dataframe["days"] <= 90), "Rating"].mean() * w2 / 100 + \
           dataframe.loc[(dataframe["days"] > 90) & (dataframe["days"] <= 180), "Rating"].mean() * w3 / 100 + \
           dataframe.loc[dataframe["days"] > 180,"Rating"].mean() * w4 / 100
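
One caveat worth sketching: with pandas, the mean of an empty selection is NaN, so if any of the four windows contains no reviews the whole sum becomes NaN. The helper below is a hypothetical plain-Python variant (the name `bucketed_weighted_average`, the `(days, rating)` tuple format, and the choice to rescale the remaining weights are all assumptions, not part of the notebook) that skips empty windows.

```python
from statistics import mean

def bucketed_weighted_average(rows, edges=(30, 90, 180), weights=(28, 26, 24, 22)):
    """rows: iterable of (days, rating) pairs. Hypothetical helper for illustration."""
    buckets = [[] for _ in weights]
    for days, rating in rows:
        # Index of the first edge that is >= days; anything past the last edge
        # falls into the final bucket (same boundaries as the function above).
        idx = next((i for i, edge in enumerate(edges) if days <= edge), len(edges))
        buckets[idx].append(rating)
    # Keep only non-empty buckets and rescale their weights so they sum to 1.
    used = [(mean(b), w) for b, w in zip(buckets, weights) if b]
    if not used:
        return float("nan")
    total_w = sum(w for _, w in used)
    return sum(m * w for m, w in used) / total_w

# Toy data with no reviews in the 91-180 day window.
rows = [(5, 5.0), (20, 4.0), (60, 4.5), (400, 3.0)]
score = bucketed_weighted_average(rows)
print(round(score, 4))
```

When every window has at least one review and the weights sum to 100, this reduces to the same arithmetic as `time_based_weighted_average`. Rescaling is only one possible policy; raising an error on an empty window would be an equally reasonable choice.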

<div style="border-radius:10px; border:#D0C2F0 solid; padding: 15px; background-color: #FFF0F4; font-size:100%; text-align:left">

<h3 align="center"><font color='#5E5273'> 🗣️ User-Based (User-Quality / User-Rank) Weighted Average 🗣️ </font></h3>

In [14]:
#Should all users' ratings be weighted equally?
#For instance, should a person who has watched the entire course hold the same weight as someone who has only watched 1% of the course?
#There are different ratings for different viewing rates:

df.groupby("Progress").agg({"Rating": "mean"})

#Ratings appear to increase with progress.

Progress,Rating
0.0,4.673913
1.0,4.642691
2.0,4.654762
3.0,4.663551
4.0,4.777328
...,...
94.0,5.000000
95.0,4.794118
97.0,5.000000
98.0,5.000000


In [15]:
df.loc[df["Progress"] <= 10,"Rating"].mean() * 22 / 100 + \
    df.loc[(df["Progress"] > 10) & (df["Progress"] <= 45), "Rating"].mean() * 24/ 100 + \
    df.loc[(df["Progress"] > 45) & (df["Progress"] <= 75), "Rating"].mean() * 26 / 100 + \
    df.loc[df["Progress"] > 75,"Rating"].mean() * 28 / 100

4.800257704672543

### I assigned a weight of 22% to those with progress of 10 or less, 24% to those above 10 and up to 45, 26% to those above 45 and up to 75, and 28% to those above 75.

> This results in an average of 4.80, higher than the plain mean of 4.76, because viewers with more progress tend to give higher ratings and now carry more weight. A person who has watched most of the course is likely to understand it better, so their rating arguably deserves more weight than that of someone who has watched only a small portion.

In [16]:
def user_based_weighted_average(dataframe, w1=22, w2=24, w3=26, w4=28):
    return  dataframe.loc[dataframe["Progress"] <= 10,"Rating"].mean() * w1 / 100 + \
            dataframe.loc[(dataframe["Progress"] > 10) & (dataframe["Progress"] <= 45), "Rating"].mean() * w2 / 100 + \
            dataframe.loc[(dataframe["Progress"] > 45) & (dataframe["Progress"] <= 75), "Rating"].mean() * w3 / 100 + \
            dataframe.loc[dataframe["Progress"] > 75,"Rating"].mean() * w4 / 100

<div style="border-radius:10px; border:#D0C2F0 solid; padding: 15px; background-color: #FFF0F4; font-size:100%; text-align:left">

<h3 align="center"><font color='#5E5273'> 🏋️  Weighted Rating 🏋️ </font></h3>

In [17]:
#Combine the time-based and user-based averages in one function
def course_weighted_rating(dataframe, time_w=50, user_w=50):
    return time_based_weighted_average(dataframe) * time_w / 100  + user_based_weighted_average(dataframe)* user_w/100


course_weighted_rating(df)

4.782641693469868

In [18]:
#Let's say user quality is more important to us
course_weighted_rating(df, time_w=40, user_w=60)

4.786164895710403
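
As a quick arithmetic check, both blended scores can be reproduced from the two component values printed earlier. The `blend` helper below is a hypothetical stand-in that repeats the arithmetic of `course_weighted_rating` on plain numbers; it is not part of the notebook.

```python
import math

# Component scores copied from the cell outputs above.
time_based = 4.765025682267194   # time-based weighted average
user_based = 4.800257704672543   # progress-based weighted average

def blend(a, b, w_a=50, w_b=50):
    # Same arithmetic as course_weighted_rating, applied to two precomputed numbers.
    return a * w_a / 100 + b * w_b / 100

equal_weights = blend(time_based, user_based)
user_heavy = blend(time_based, user_based, w_a=40, w_b=60)

print(equal_weights)
print(user_heavy)
```

Because the user-based component is the larger of the two, shifting weight toward it nudges the final score upward, which matches the move from about 4.783 to about 4.786.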

# <p style="border-radius:10px; border:#DEB887 solid; padding:25px; background-color: #FFFAF0; font-size:100%;color:#52017A;text-align:center;"> Sorting Product  </p>

In [19]:
df = pd.read_csv("/kaggle/input/privatedata3/product_sorting.csv")

df.head().T

Unnamed: 0,0,1,2,3,4
course_name,(50+ Saat) Python A-Z™: Veri Bilimi ve Machine...,Python: Yapay Zeka ve Veri Bilimi için Python ...,5 Saatte Veri Bilimci Olun (Valla Billa),R ile Veri Bilimi ve Machine Learning (35 Saat),(2020) Python ile Makine Öğrenmesi (Machine Le...
instructor_name,Veri Bilimi Okulu,Veri Bilimi Okulu,Instructor_1,Veri Bilimi Okulu,Veri Bilimi Okulu
purchase_count,17380,48291,18693,6626,11314
rating,4.8,4.6,4.4,4.6,4.6
commment_count,4621,4488,2362,1027,969
5_point,3466,2962,1582,688,717
4_point,924,1122,567,257,194
3_point,185,314,165,51,38
2_point,46,45,24,10,10
1_point,6,45,24,21,10


In [20]:
#Sorting by rating:
df.sort_values("rating", ascending=False).head(10)
#We shouldn't sort solely by rating; for instance, there are courses with a high rating but a very low comment count.
#We need a ranking that takes the number of purchases, the rating, and the number of comments into account at the same time.

Unnamed: 0,course_name,instructor_name,purchase_count,rating,commment_count,5_point,4_point,3_point,2_point,1_point
0,(50+ Saat) Python A-Z™: Veri Bilimi ve Machine...,Veri Bilimi Okulu,17380,4.8,4621,3466,924,185,46,6
10,İleri Düzey Excel|Dashboard|Excel İp Uçları,Veri Bilimi Okulu,9554,4.8,2266,1654,499,91,22,0
19,Alıştırmalarla SQL Öğreniyorum,Veri Bilimi Okulu,3155,4.8,235,200,31,4,0,0
5,Course_1,Instructor_2,4601,4.8,213,164,45,4,0,0
6,Course_2,Instructor_3,3171,4.7,856,582,205,51,9,9
14,Uçtan Uca SQL Server Eğitimi,Veri Bilimi Okulu,12893,4.7,2425,1722,510,145,24,24
8,A'dan Z'ye Apache Spark (Scala & Python),Veri Bilimi Okulu,6920,4.7,214,154,41,13,2,4
13,Course_5,Instructor_6,6056,4.7,144,82,46,12,1,3
27,Course_15,Instructor_1,1164,4.6,98,65,24,6,0,3
1,Python: Yapay Zeka ve Veri Bilimi için Python ...,Veri Bilimi Okulu,48291,4.6,4488,2962,1122,314,45,45


In [21]:
#Sorting by comment count looks more appealing than sorting by rating, but similar issues persist. For example, a course might have
#a high number of purchases only because it was distributed for free.

df.sort_values("commment_count",ascending=False).head(10)

Unnamed: 0,course_name,instructor_name,purchase_count,rating,commment_count,5_point,4_point,3_point,2_point,1_point
0,(50+ Saat) Python A-Z™: Veri Bilimi ve Machine...,Veri Bilimi Okulu,17380,4.8,4621,3466,924,185,46,6
1,Python: Yapay Zeka ve Veri Bilimi için Python ...,Veri Bilimi Okulu,48291,4.6,4488,2962,1122,314,45,45
20,Course_9,Instructor_3,12946,4.5,3371,2191,877,203,33,67
14,Uçtan Uca SQL Server Eğitimi,Veri Bilimi Okulu,12893,4.7,2425,1722,510,145,24,24
2,5 Saatte Veri Bilimci Olun (Valla Billa),Instructor_1,18693,4.4,2362,1582,567,165,24,24
15,Uygulamalarla SQL Öğreniyorum,Veri Bilimi Okulu,11397,4.5,2353,1435,705,165,24,24
10,İleri Düzey Excel|Dashboard|Excel İp Uçları,Veri Bilimi Okulu,9554,4.8,2266,1654,499,91,22,0
3,R ile Veri Bilimi ve Machine Learning (35 Saat),Veri Bilimi Okulu,6626,4.6,1027,688,257,51,10,21
4,(2020) Python ile Makine Öğrenmesi (Machine Le...,Veri Bilimi Okulu,11314,4.6,969,717,194,38,10,10
9,Modern R Programlama Eğitimi,Veri Bilimi Okulu,6537,4.4,901,559,252,72,9,9


<div style="border-radius:10px; border:#D0C2F0 solid; padding: 15px; background-color: #FFF0F4; font-size:100%; text-align:left">

<h3 align="center"><font color='#5E5273'> Sorting by Rating, Comment and Purchase </font></h3>

In [22]:
#Rescale the purchase counts to the range 1 to 5 so they are on the same scale as the rating:
df["purchase_count_scaled"] = MinMaxScaler(feature_range=(1, 5)).fit_transform(df[["purchase_count"]]).ravel()

#Rescale the comment counts to the range 1 to 5 as well ("commment_count" is the column's spelling in the dataset):
df["comment_count_scaled"] = MinMaxScaler(feature_range=(1, 5)).fit_transform(df[["commment_count"]]).ravel()
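
For readers who want to see what the scaling step does, here is a minimal plain-Python sketch of linear min-max rescaling to the range 1 to 5. The function name `min_max_scale`, the toy numbers, and the handling of the all-equal case are assumptions made for illustration.

```python
def min_max_scale(values, low=1.0, high=5.0):
    """Rescale values linearly so that min(values) -> low and max(values) -> high."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # Degenerate case: all values equal. This sketch maps them all to `low`.
        return [low for _ in values]
    span = high - low
    return [low + (v - v_min) / (v_max - v_min) * span for v in values]

# Toy purchase counts, not taken from the dataset.
purchases = [100, 1_000, 10_000, 50_000]
scaled = min_max_scale(purchases)
print([round(s, 3) for s in scaled])
```

One consequence of this kind of scaling is that a single very large value compresses everything else toward the bottom of the range, so the purchase component can end up driven mostly by the best-selling course.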

In [23]:
#The weights are a judgment call; we gave the highest weight, 42%, to the rating, but this can be changed.
#Purchases get the lowest weight, 26%, because purchase counts can be inflated, for instance by free giveaways.
#With 32% for comments, 26% for purchases, and 42% for ratings, the scores are:
(df["comment_count_scaled"] * 32 / 100 +
 df["purchase_count_scaled"] * 26 / 100 +
 df["rating"] * 42 / 100)

#The weighted scores let the social-proof signals (comments and purchases) influence the ranking alongside the rating.

0     4.249884
1     4.795104
2     3.483494
3     2.937105
4     3.022039
5     2.751651
6     2.857214
7     2.522386
8     2.759901
9     2.816233
10    3.427921
11    2.987387
12    2.528686
13    2.721863
14    3.501984
15    3.365772
16    2.517977
17    2.282315
18    2.458066
19    2.726593
20    3.681563
21    2.519056
22    2.538280
23    2.354586
24    2.264794
25    2.021050
26    2.436116
27    2.561682
28    2.544666
29    1.925836
30    1.924000
31    2.273764
dtype: float64

In [24]:
#The same calculation as a function
def weighted_sorting_score(dataframe, w1=32, w2=26, w3=42):
    return (dataframe["comment_count_scaled"] * w1 / 100 +
            dataframe["purchase_count_scaled"]* w2 / 100 +
            dataframe["rating"] * w3 /100)

df["weighted_sorting_score"] = weighted_sorting_score(df)

In [25]:
#Weighting the three factors together gives a more balanced ranking.
df.sort_values("weighted_sorting_score", ascending=False).head(10)

Unnamed: 0,course_name,instructor_name,purchase_count,rating,commment_count,5_point,4_point,3_point,2_point,1_point,purchase_count_scaled,comment_count_scaled,weighted_sorting_score
1,Python: Yapay Zeka ve Veri Bilimi için Python ...,Veri Bilimi Okulu,48291,4.6,4488,2962,1122,314,45,45,5.0,4.884699,4.795104
0,(50+ Saat) Python A-Z™: Veri Bilimi ve Machine...,Veri Bilimi Okulu,17380,4.8,4621,3466,924,185,46,6,2.438014,5.0,4.249884
20,Course_9,Instructor_3,12946,4.5,3371,2191,877,203,33,67,2.070512,3.916342,3.681563
14,Uçtan Uca SQL Server Eğitimi,Veri Bilimi Okulu,12893,4.7,2425,1722,510,145,24,24,2.06612,3.096229,3.501984
2,5 Saatte Veri Bilimci Olun (Valla Billa),Instructor_1,18693,4.4,2362,1582,567,165,24,24,2.546839,3.041612,3.483494
10,İleri Düzey Excel|Dashboard|Excel İp Uçları,Veri Bilimi Okulu,9554,4.8,2266,1654,499,91,22,0,1.789374,2.958388,3.427921
15,Uygulamalarla SQL Öğreniyorum,Veri Bilimi Okulu,11397,4.5,2353,1435,705,165,24,24,1.942127,3.03381,3.365772
4,(2020) Python ile Makine Öğrenmesi (Machine Le...,Veri Bilimi Okulu,11314,4.6,969,717,194,38,10,10,1.935248,1.833984,3.022039
11,Course_3,Instructor_4,24809,4.3,250,95,87,51,12,5,3.053749,1.210663,2.987387
3,R ile Veri Bilimi ve Machine Learning (35 Saat),Veri Bilimi Okulu,6626,4.6,1027,688,257,51,10,21,1.546694,1.884265,2.937105


<div style="border-radius:10px; border:#D0C2F0 solid; padding: 15px; background-color: #FFF0F4; font-size:100%; text-align:left">

<h3 align="center"><font color='#5E5273'> Bayesian Average Rating Score </font></h3>

In [26]:
#It calculates a weighted probabilistic mean over the rating distribution.
#n holds the rating counts in ascending order of stars: 1_point, 2_point, 3_point, 4_point, 5_point.
def bayesian_average_rating(n, confidence=0.95):
    if sum(n) == 0:
        return 0
    K = len(n)
    z = st.norm.ppf(1 - (1 - confidence) / 2)
    N = sum(n)
    first_part = 0.0
    second_part = 0.0
    #Use n_k from enumerate rather than n[k]: positional n[k] on a label-indexed Series is deprecated in recent pandas.
    for k, n_k in enumerate(n):
        first_part += (k + 1) * (n_k + 1) / (N + K)
        second_part += (k + 1) * (k + 1) * (n_k + 1) / (N + K)
    score = first_part - z * math.sqrt((second_part - first_part * first_part) / (N + K + 1))
    return score
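
Read off the code above, the function appears to compute the following, where $n_k$ is the number of $k$-star ratings, $N = \sum_k n_k$ is the total number of ratings, $K$ is the number of rating levels (5 here), and $z$ is the standard normal quantile at $1 - (1 - \text{confidence})/2$. This is a transcription of the code, not a derivation.

```latex
\text{score}
  = \sum_{k=1}^{K} k \,\frac{n_k + 1}{N + K}
  \;-\; z \sqrt{\frac{\displaystyle\sum_{k=1}^{K} k^{2}\,\frac{n_k + 1}{N + K}
        \;-\; \left(\sum_{k=1}^{K} k \,\frac{n_k + 1}{N + K}\right)^{2}}{N + K + 1}}
```

The first sum is a smoothed mean rating (each star level gets one extra pseudo-vote), and the subtracted term shrinks as $N$ grows, so courses with few ratings are penalized more.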

In [27]:
#They can be named as "Bar Sorting Score," "Bar Rating," and "Bar Average Rating."
df["bar_score"] = df.apply(lambda x: bayesian_average_rating(x[["1_point",
                                                                "2_point",
                                                                "3_point",
                                                                "4_point",
                                                                "5_point"]]), axis=1)

> If we had a prior reference point for all purchases and comments, and could fully trust it, a Bayesian method would give the most statistically sound and consistent ranking.

> Because these values are derived from the rating distribution and will feed into later steps, we can call them a "score." The Bar Score ranks courses using the rating distribution alone, whereas the Weighted Sorting Score also draws on comments and purchases, so the two rankings will differ. Either the Bar Score or the Weighted Sorting Score can be used.
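
To see how the Bar Score reacts to sample size, here is a standard-library restatement of the same formula. The name `bar_score` and the use of `statistics.NormalDist` in place of `scipy.stats` are choices made for this sketch; the vote counts in the first two calls are made up, while the third call uses the star counts of the course at index 1 in the table above.

```python
import math
from statistics import NormalDist

def bar_score(counts, confidence=0.95):
    """counts[k] is the number of (k+1)-star ratings. Illustrative restatement only."""
    total = sum(counts)
    if total == 0:
        return 0.0
    levels = len(counts)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Smoothed first and second moments: each star level gets one pseudo-vote.
    mean = sum((k + 1) * (c + 1) / (total + levels) for k, c in enumerate(counts))
    mean_sq = sum((k + 1) ** 2 * (c + 1) / (total + levels) for k, c in enumerate(counts))
    return mean - z * math.sqrt((mean_sq - mean * mean) / (total + levels + 1))

few = bar_score([0, 0, 0, 0, 2])      # two 5-star votes (made up)
many = bar_score([0, 0, 0, 0, 200])   # two hundred 5-star votes (made up)
course_1 = bar_score([45, 45, 314, 1122, 2962])  # 1- to 5-star counts from the table

print(round(few, 3), round(many, 3), round(course_1, 6))
```

Both made-up items have a raw average of exactly 5 stars, yet the one backed by only two votes scores far lower, which is the behavior that keeps thinly rated courses from jumping to the top.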

<div style="border-radius:10px; border:#D0C2F0 solid; padding: 15px; background-color: #FFF0F4; font-size:100%; text-align:left">

<h3 align="center"><font color='#5E5273'> Hybrid Product Sorting </font></h3>

In [28]:
def hybrid_sorting_score(dataframe, bar_w=60, wss_w=40):
    bar_score = dataframe.apply(lambda x: bayesian_average_rating(x[["1_point",
                                                                     "2_point",
                                                                     "3_point",
                                                                     "4_point",
                                                                     "5_point"]]),axis=1)
    wss_score = weighted_sorting_score(dataframe)
    return bar_score*bar_w/100 + wss_score*wss_w/100

In [29]:
#The resulting ranking is statistically grounded, reflects business priorities, and still gives a chance to promising newcomers.
#Giving 60% of the weight to the Bar Score is what lets those promising courses surface.
df["hybrid_sorting_score"] = hybrid_sorting_score(df)
df.sort_values("hybrid_sorting_score", ascending=False).head(5)

Unnamed: 0,course_name,instructor_name,purchase_count,rating,commment_count,5_point,4_point,3_point,2_point,1_point,purchase_count_scaled,comment_count_scaled,weighted_sorting_score,bar_score,hybrid_sorting_score
1,Python: Yapay Zeka ve Veri Bilimi için Python ...,Veri Bilimi Okulu,48291,4.6,4488,2962,1122,314,45,45,5.0,4.884699,4.795104,4.516038,4.627664
0,(50+ Saat) Python A-Z™: Veri Bilimi ve Machine...,Veri Bilimi Okulu,17380,4.8,4621,3466,924,185,46,6,2.438014,5.0,4.249884,4.665857,4.499468
20,Course_9,Instructor_3,12946,4.5,3371,2191,877,203,33,67,2.070512,3.916342,3.681563,4.480627,4.161001
10,İleri Düzey Excel|Dashboard|Excel İp Uçları,Veri Bilimi Okulu,9554,4.8,2266,1654,499,91,22,0,1.789374,2.958388,3.427921,4.641679,4.156176
14,Uçtan Uca SQL Server Eğitimi,Veri Bilimi Okulu,12893,4.7,2425,1722,510,145,24,24,2.06612,3.096229,3.501984,4.568162,4.141691
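
The effect of the 60/40 split can be traced by hand for two rows of the table. The sketch below copies the `bar_score` and `weighted_sorting_score` values for the courses at index 10 and 14 and recomputes their order; the `hybrid` helper is a hypothetical stand-in for the arithmetic in `hybrid_sorting_score`.

```python
# (bar_score, weighted_sorting_score) copied from the table above, keyed by row index.
rows = {
    10: (4.641679, 3.427921),
    14: (4.568162, 3.501984),
}

def hybrid(bar, wss, bar_w=60, wss_w=40):
    # Same weighted sum as hybrid_sorting_score, on precomputed numbers.
    return bar * bar_w / 100 + wss * wss_w / 100

by_wss = sorted(rows, key=lambda i: rows[i][1], reverse=True)
by_hybrid = sorted(rows, key=lambda i: hybrid(*rows[i]), reverse=True)

print("by weighted sorting score:", by_wss)
print("by hybrid score:          ", by_hybrid)
```

Course 14 leads on the weighted sorting score alone, but course 10 has the stronger rating distribution, and with 60% of the weight on the Bar Score that is enough to move it ahead.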


# <p style="border-radius:10px; border:#DEB887 solid; padding:25px; background-color: #FFFAF0; font-size:100%;color:#52017A;text-align:center;"> Sorting Reviews  </p>

> Whether a review is positive or negative, the goal is to highlight the comments that other people find helpful.

In [30]:
#We start with the up-down difference score = up ratings - down ratings

def score_up_down_diff(up,down):
    return up - down

In [31]:
#Scenarios 1 and 2
#Review 1: 600 up, 400 down, 1000 total
#Review 2: 5500 up, 4500 down, 10000 total

In [32]:
#review1
score_up_down_diff(600,400)
#60% positive

200

In [33]:
score_up_down_diff(5500,4500)
#55% positive

1000

> Even though the second review is only 55% positive, its score of 1000 ranks it above the first review, which it shouldn't. The review with 60% positive votes should be recommended first.

In [34]:
#To solve this problem we use a second function, the average rating: score = up ratings / all ratings
def score_average_rating(up,down):
    if up + down == 0:
        return 0
    return up / (up+down)

In [35]:
score_average_rating(600,400)
#0.6 rate

0.6

In [36]:
score_average_rating(5500, 4500)
#0.55 rate
#This works well for these two reviews, but there is another problem...

0.55

In [37]:
#Let's check this scenario:
#Review 1: 2 up, 0 down, 2 total
#Review 2: 100 up, 1 down, 101 total

print(score_average_rating(2,0))
#gives a 100% rating
print(score_average_rating(100, 1))
#and gives a 99% rating

#This is not acceptable: the review with 100 up and 1 down should be recommended, but the one with 2 up and 0 down ranks higher.
#The average rating ignores how many votes stand behind each proportion.

1.0
0.9900990099009901


In [38]:
# Wilson lower bound score
def wilson_lower_bound(up,down,confidence=0.95):
    n = up + down
    if n == 0:
        return 0
    z = st.norm.ppf(1- (1 - confidence) / 2)
    phat = 1.0 * up / n
    return (phat + z * z / (2 * n) - z * math.sqrt((phat * (1 - phat) + z * z / (4 * n)) / n )) / (1 + z * z / n)


> The Wilson Score Interval is a method for calculating the confidence interval of a proportion in a statistical manner. In the context of user reviews or ratings, the Wilson Score can be used to determine the confidence interval of the true underlying proportion of positive ratings given the observed proportion of positive ratings and the sample size. It's particularly useful in cases where the number of ratings or reviews might be small, as it takes into account both the number of positive ratings and the total number of ratings, providing a more accurate representation of the underlying sentiment. The Wilson Lower Bound Score, a variation of the Wilson Score, specifically focuses on the lower bound of the confidence interval, often used for ranking purposes, ensuring a minimum level of confidence in the calculated rating.
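
Written out, the expression returned by `wilson_lower_bound` is the following, with $n$ the total number of votes, $\hat{p} = \text{up}/n$ the observed share of positive votes, and $z$ the standard normal quantile at $1 - (1 - \text{confidence})/2$. This is a transcription of the code above rather than a derivation.

```latex
\text{WLB}
  = \frac{\hat{p} + \dfrac{z^{2}}{2n}
      - z \sqrt{\dfrac{\hat{p}\,(1 - \hat{p}) + \dfrac{z^{2}}{4n}}{n}}}
         {1 + \dfrac{z^{2}}{n}}
```

As $n$ grows, the $z^{2}/n$ terms vanish and the bound approaches $\hat{p}$; for small $n$ they pull the score well below the observed proportion.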

In [39]:
#As we can see, the Wilson lower bound score fixes our problem.
print(wilson_lower_bound(600,400))

print(wilson_lower_bound(5500, 4500))

print(wilson_lower_bound(2, 0))

print(wilson_lower_bound(100,1))

0.5693094295142663
0.5402319557715324
0.3423802275066531
0.9460328420055449
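
To compare the three approaches side by side, the sketch below ranks the four example reviews under each scoring rule. It restates the functions with the standard library only (`statistics.NormalDist` instead of `scipy.stats`), and the labels A to D are hypothetical names for the four scenarios used above.

```python
import math
from statistics import NormalDist

def up_down_diff(up, down):
    return up - down

def average_rating(up, down):
    return up / (up + down) if up + down else 0.0

def wilson_lower_bound(up, down, confidence=0.95):
    n = up + down
    if n == 0:
        return 0.0
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    phat = up / n
    return (phat + z * z / (2 * n)
            - z * math.sqrt((phat * (1 - phat) + z * z / (4 * n)) / n)) / (1 + z * z / n)

# (up, down) vote counts from the scenarios above; labels are made up.
reviews = {"A": (600, 400), "B": (5500, 4500), "C": (2, 0), "D": (100, 1)}

def rank(score_fn):
    # Highest score first.
    return sorted(reviews, key=lambda r: score_fn(*reviews[r]), reverse=True)

rank_diff = rank(up_down_diff)
rank_avg = rank(average_rating)
rank_wilson = rank(wilson_lower_bound)

print("up-down diff:  ", rank_diff)
print("average rating:", rank_avg)
print("wilson bound:  ", rank_wilson)
```

Only the Wilson lower bound puts the review with 100 up and 1 down first while still ranking the 60% review above the 55% one, which is the ordering argued for in the text.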
