# Investigation
- Assume the average person has an average (posterior) probability of detecting that someone is lying; the goal is to estimate that probability
- Reference: The Prevalence of Lying in America: Three Studies of Self-Reported Lies (Kim B. Serota, Timothy R. Levine, & Franklin J. Boster)
    - (M = 1.65 lies per day, SD = 4.45, Mdn = 0, Mode = 0, N = 998, Max = 53 lies, 95% CI = 1.37–1.93)

- Reference: http://liespotting.com/2010/06/10-research-findings-about-deception-that-will-blow-your-mind/
    - We detect lies on average with 54% accuracy
- If the average number of lies told per day is 1.65, then over an 8-hour window (roughly half of our waking hours) the expected number of lies told by the average person is 1.65/2 = 0.825; this is used here as a rough stand-in for the probability of being lied to in that interval
- If we detect lies with 54% accuracy, then the likelihood of detecting a lie during this 8-hour window is 54% of 0.825 (0.54 × 0.825 = 0.4455)
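
The back-of-the-envelope arithmetic above can be reproduced in a few lines. This is a minimal sketch: the variable names are illustrative, and the Poisson line is an added assumption (lies arriving independently at a constant rate) that comes from neither reference. It is there only to show that an expected count of 0.825 is not the same thing as the probability of at least one lie.

```python
import math

mean_lies_per_day = 1.65    # M = 1.65 lies per day (Serota, Levine & Boster)
detection_accuracy = 0.54   # average lie-detection accuracy (liespotting.com)
window_fraction = 0.5       # 8-hour window taken as half of the waking day

# Expected number of lies in the window, and the naive detection figure
expected_lies_in_window = mean_lies_per_day * window_fraction
naive_detection = detection_accuracy * expected_lies_in_window

# ASSUMPTION (not from the references): lies follow a Poisson process,
# so P(at least one lie) = 1 - exp(-expected count)
p_at_least_one_lie = 1 - math.exp(-expected_lies_in_window)

print(f"expected lies in window:  {expected_lies_in_window:.4f}")
print(f"naive detection estimate: {naive_detection:.4f}")
print(f"P(>=1 lie), Poisson:      {p_at_least_one_lie:.4f}")
```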

In [100]:
import scipy.stats
import matplotlib.pyplot as plt
import numpy as np
import itertools
import random

In [44]:
def f2f_lie_freq(num_lies):
    return 135.584 * (num_lies ** -1.301)

In [45]:
# Of the 1000 Adults - 60% reported never lying
lie_frequencies = [0.6*1000]
for num_lies in itertools.count(1):
    freq = f2f_lie_freq(num_lies)
    if freq < 1:
        break
    lie_frequencies.append(freq)

In [46]:
lie_frequencies = np.array(lie_frequencies)

In [47]:
# Create the lie probability mass function by normalizing the frequencies to sum to 1
lie_pdf = lie_frequencies / lie_frequencies.sum()

In [50]:
# Create the lie cumulative distribution function (running sum of the probabilities)
lie_cdf = lie_pdf.cumsum()
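
Before sampling from this distribution, it may be worth checking what it implies. The sketch below rebuilds the same distribution from the power-law fit and the 60% zero-lie share, then prints the mean number of daily lies and the probability of at least one lie, so they can be compared with the reported M = 1.65. The names `lie_counts`, `implied_mean`, and `p_any_lie` are illustrative and not part of the notebook.

```python
import itertools
import numpy as np

def f2f_lie_freq(num_lies):
    # Power-law fit used in the notebook
    return 135.584 * (num_lies ** -1.301)

# Rebuild the frequency table: 60% of 1000 adults at zero lies,
# then the fitted counts until the fit drops below one person
lie_frequencies = [0.6 * 1000]
for num_lies in itertools.count(1):
    freq = f2f_lie_freq(num_lies)
    if freq < 1:
        break
    lie_frequencies.append(freq)

lie_pdf = np.array(lie_frequencies) / np.sum(lie_frequencies)
lie_cdf = lie_pdf.cumsum()

# Summary statistics implied by the fitted distribution
lie_counts = np.arange(len(lie_pdf))
implied_mean = float((lie_counts * lie_pdf).sum())
p_any_lie = float(1 - lie_pdf[0])

print(f"largest lie count in table: {lie_counts[-1]}")
print(f"implied mean lies per day:  {implied_mean:.3f}")
print(f"P(at least one lie):        {p_any_lie:.3f}")
```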

In [127]:
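def how_many_lies(cdf):
    # Inverse-CDF sampling: return the first index whose cumulative probability is >= rand
    rand = random.random()
    idx = int(np.searchsorted(cdf, rand, side="left"))
    # Guard against round-off leaving cdf[-1] slightly below 1
    return min(idx, len(cdf) - 1)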
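
A quick sanity check on the inverse-CDF sampler is to draw many samples and compare the empirical frequencies with the target probabilities. The sketch below does this in vectorized form with `np.searchsorted`; the seed, the sample size, and the names `pmf`, `cdf`, `empirical`, and `max_abs_error` are arbitrary choices made for illustration.

```python
import itertools
import numpy as np

def f2f_lie_freq(num_lies):
    # Power-law fit used in the notebook
    return 135.584 * (num_lies ** -1.301)

# Rebuild the distribution exactly as in the earlier cells
freqs = [0.6 * 1000]
for num_lies in itertools.count(1):
    freq = f2f_lie_freq(num_lies)
    if freq < 1:
        break
    freqs.append(freq)
pmf = np.array(freqs) / np.sum(freqs)
cdf = pmf.cumsum()

# Draw uniform numbers and map each to the first index with cdf >= u
rng = np.random.default_rng(0)  # seed 0 is an arbitrary choice
u = rng.random(200_000)
samples = np.minimum(np.searchsorted(cdf, u, side="left"), len(cdf) - 1)

# Compare empirical frequencies against the target probabilities
empirical = np.bincount(samples, minlength=len(pmf)) / len(samples)
max_abs_error = float(np.abs(empirical - pmf).max())
print(f"largest gap between empirical and target probability: {max_abs_error:.5f}")
```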

In [187]:
def catch_lie(p=0.54):
    rand = random.random()
    return rand < p

In [256]:
lie_occurred, lie_caught = [], []
for day in range(100_000):
    daily_lies = how_many_lies(lie_cdf)
    # Record whether at least one lie was told on this day
    lie_occurred.append(1 if daily_lies > 0 else 0)
    # A day counts as "caught" if any one of its lies is detected;
    # any() stops at the first detected lie, like the original break
    caught = any(catch_lie() for _ in range(daily_lies))
    lie_caught.append(1 if caught else 0)

In [257]:
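# Fraction of simulated days on which at least one lie was caught
posterior_lie_probability = sum(lie_caught) / len(lie_caught)
# Fraction of simulated days on which at least one lie was told
daily_lied_to_probability = sum(lie_occurred) / len(lie_occurred)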

In [258]:
posterior_lie_probability

0.3135

In [259]:
daily_lied_to_probability

0.39187
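
Both simulated values can be cross-checked without random sampling. The simulation loop treats each lie as caught independently with probability p = 0.54, so a day with k lies has at least one lie caught with probability 1 - (1 - p)^k. Weighting that by the lie distribution gives a closed-form estimate that should agree with the simulation to within sampling noise. The names `analytic_caught` and `analytic_lied_to` are illustrative.

```python
import itertools
import numpy as np

def f2f_lie_freq(num_lies):
    # Power-law fit used in the notebook
    return 135.584 * (num_lies ** -1.301)

# Rebuild the lie distribution exactly as in the earlier cells
freqs = [0.6 * 1000]
for num_lies in itertools.count(1):
    freq = f2f_lie_freq(num_lies)
    if freq < 1:
        break
    freqs.append(freq)
pmf = np.array(freqs) / np.sum(freqs)

p_catch = 0.54                      # same default as catch_lie()
k = np.arange(len(pmf))             # possible numbers of lies per day

# P(at least one lie caught) = sum_k P(k lies) * (1 - (1 - p)^k)
analytic_caught = float((pmf * (1 - (1 - p_catch) ** k)).sum())
# P(at least one lie told) = 1 - P(0 lies)
analytic_lied_to = float(1 - pmf[0])

print(f"analytic P(lie caught on a day): {analytic_caught:.4f}")
print(f"analytic P(lied to on a day):    {analytic_lied_to:.4f}")
```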

# Notes:
- How can false positives and false negatives be incorporated into the 54% lie-detection accuracy? Look for evidence of these rates in the literature
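
One possible way to fold these rates in is Bayes' rule with separate sensitivity (true-positive rate) and specificity (true-negative rate). The sketch below is only a template: the function `p_lie_given_flagged` is hypothetical, and setting both rates to 0.54 is a placeholder assumption rather than a figure from the literature, since a single accuracy number does not determine the two rates separately.

```python
def p_lie_given_flagged(prior, sensitivity, specificity):
    """Posterior probability that a statement is a lie, given it was judged a lie."""
    true_pos = sensitivity * prior                 # lie, correctly flagged
    false_pos = (1 - specificity) * (1 - prior)    # truth, wrongly flagged
    return true_pos / (true_pos + false_pos)

# PLACEHOLDER inputs, to be replaced with rates found in the literature
prior = 0.39         # roughly the simulated daily_lied_to_probability
sensitivity = 0.54   # ASSUMPTION: set equal to the overall 54% accuracy
specificity = 0.54   # ASSUMPTION: set equal to the overall 54% accuracy

posterior = p_lie_given_flagged(prior, sensitivity, specificity)
print(f"P(lie | judged a lie): {posterior:.4f}")
```

With sensitivity and specificity both at 0.5 (pure guessing) the posterior equals the prior, which is a useful check that the formula is wired correctly.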