**benfordslaw** is a Python package that tests whether an empirical (observed) distribution differs significantly from a theoretical (expected, Benford's) distribution. This notebook shows some examples.

[GitHub benfordslaw](https://github.com/erdogant/benfordslaw)
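
For reference, the theoretical distribution mentioned above can be written down directly. Under Benford's law the probability that a number has leading digit d is log10(1 + 1/d). The short sketch below uses only the standard library and is independent of the package; the name `benford` is chosen here purely for illustration.

```python
import math

# Expected probability of each leading digit d = 1..9 under Benford's law:
# P(d) = log10(1 + 1/d)
benford = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

# Print the expected share of each leading digit as a percentage
for d, p in benford.items():
    print(f"{d}: {100 * p:.1f}%")
```

Small leading digits are far more common than large ones, which is the pattern the observed data is compared against in the rest of this notebook.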

In [None]:
# Install from PyPI
!pip install benfordslaw

In [None]:
# Check version
import benfordslaw
print(benfordslaw.__version__)

In [None]:
# Load library
from benfordslaw import benfordslaw
# Initialize with default parameters
bl = benfordslaw()

In [None]:
# Import USA example
df = bl.import_example(data='elections_usa')

In [None]:
print(df.head())

In [None]:
# Get data for candidate Donald Trump
Iloc = df['candidate']=='Donald Trump'
X = df['votes'].loc[Iloc].values
print(X)

In [None]:
# Test whether the empirical (observed) distribution differs significantly from the theoretical (expected, Benford's) distribution.
results = bl.fit(X)

In [None]:
# Plot
bl.plot(title='Donald Trump')

In [None]:
# Plot again with custom figure settings: bar color, font size, and bar width
bl.plot(title='Donald Trump', barcolor=[0.5,0.5,0.5], fontsize=12, barwidth=0.4)

In [None]:
# Print the P-value
bl.results['P']

In [None]:
# Print the test statistic (stored under the key 't')
bl.results['t']
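
To make the idea of a test statistic more concrete, here is a minimal hand-rolled sketch of a chi-square comparison between observed leading digits and Benford's expected counts. It is an illustration only: the helper names `leading_digit` and `chi2_vs_benford` are hypothetical, the input is assumed to be positive integers (such as vote counts), and this is not necessarily how benfordslaw computes its value internally.

```python
import math
from collections import Counter


def leading_digit(n):
    """Return the first digit of a positive integer."""
    return int(str(n)[0])


def chi2_vs_benford(values):
    """Chi-square statistic of observed leading digits against Benford's law."""
    # Keep only positive values; zero has no leading digit in 1..9
    digits = [leading_digit(v) for v in values if v > 0]
    n = len(digits)
    observed = Counter(digits)
    stat = 0.0
    for d in range(1, 10):
        # Expected count of leading digit d under Benford's law
        expected = n * math.log10(1 + 1 / d)
        stat += (observed[d] - expected) ** 2 / expected
    return stat


# Powers of two are a classic example of numbers that follow Benford's law
sample = [2 ** k for k in range(1, 101)]
print(chi2_vs_benford(sample))
```

With nine digit classes the statistic has 8 degrees of freedom, so values above roughly 15.5 would indicate a significant deviation at the 0.05 level. A small statistic means the observed digits are close to the expected distribution.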

In [None]:
# Make the plot yourself given the stored results
import matplotlib.pyplot as plt
plt.plot(bl.results['percentage_emp'][:,0], bl.results['percentage_emp'][:,1], label='Empirical distribution')
plt.plot(bl.results['percentage_emp'][:,0], bl.leading_digits, label="Benford's distribution")
plt.legend()
plt.grid(True)
plt.xlabel('Leading digits')
plt.ylabel('Frequency (%)')

**Iterate over all candidates in the dataset and plot each one**

In [None]:
# Fit and plot every candidate in the USA elections example
df = bl.import_example('elections_usa')
for candidate in df['candidate'].unique():
    Iloc = df['candidate']==candidate
    X = df['votes'].loc[Iloc].values
    bl.fit(X)
    bl.plot(title=candidate)