# 1. Background
![](https://lh3.googleusercontent.com/2EtUdzAuh3MT6gb32-34hIxsWwoAb0ryfJDP22z2PEtL1uoGeh-vUZ4ZFWY4zPyGJw)
From Wikipedia: Patreon (/ˈpeɪtriɒn/, /-ən/) is a membership platform based in the United States that provides business tools for creators to run a subscription content service. It allows creators and artists to earn a monthly income by providing exclusive rewards and perks to their subscribers, or "patrons".

Patreon is used by YouTube videographers, webcomic artists, writers, podcasters, musicians, adult content creators, and other categories of creators who post regularly online. It allows artists to receive funding directly from their fans, or patrons, on a recurring basis or per work of art. The company, started by musician Jack Conte and developer Sam Yam in 2013, is based in San Francisco.

Patreon charges a commission of 5 to 12 percent of creators' monthly income, in addition to payment processing fees. Memberships are billed on the first of each month.

In [None]:
import numpy as np
import pandas as pd
import seaborn as sns
import plotly as py
import plotly.express as px  # plotly_express was merged into plotly as plotly.express
import plotly.graph_objects as go
from matplotlib import pyplot as plt
import folium
from folium import plugins
from plotly.offline import init_notebook_mode, iplot
import os
import base64
init_notebook_mode()

df = pd.read_csv('../input/patreon-top-creators/Patreon1-1000.csv', thousands=',')

# 2. Data Exploration

In [None]:
print("Number of Patrons Statistics: ")
print(df.Patrons.describe(percentiles = [0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,0.95,0.99]))

fig1 = px.histogram(df, x = 'Patrons', title = 'Distribution of Number of Patrons')
fig1.show()

# Skip the first 10 rows; this assumes the CSV is ordered by patron count, largest first.
fig2 = px.histogram(df[10:], x = 'Patrons', title = 'Distribution of Number of Patrons (Excluding Top 10 Creators)')
fig2.show()

print("Number of Days Running Statistics: ")
print(df.DaysRunning.describe(percentiles = [0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,0.95,0.99]))


fig3 = px.histogram(df, x = 'DaysRunning', title = 'Distribution of Days Running (Days Creator Has Been on Patreon)')
fig3.show()
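
Dropping the first ten rows is one way to make the bulk of the patron distribution visible. Another option, sketched here as an assumption rather than as part of the original analysis, is to count creators per order of magnitude so the largest ones stay in view. The helper name `log10_bin_counts` and the toy `patrons` values are hypothetical; in the notebook you would pass `df['Patrons']` instead.

```python
import numpy as np
import pandas as pd

def log10_bin_counts(values: pd.Series) -> pd.Series:
    """Count values per order of magnitude (floor of log10)."""
    positive = values[values > 0]  # log10 is undefined for zero
    magnitude = np.floor(np.log10(positive)).astype(int)
    return magnitude.value_counts().sort_index()

# Hypothetical patron counts, not taken from the dataset.
patrons = pd.Series([120, 450, 980, 1500, 3200, 25000])
counts = log10_bin_counts(patrons)
print(counts)
```

For a plot rather than a table, `px.histogram` also accepts `log_y=True`, which keeps every row while compressing the tall bars.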

Is there a relationship between how long the creator has been on the site and how many patrons they have?

In [None]:
fig4 = px.scatter(df, x = 'DaysRunning', y = 'Patrons')
fig4.show()

print("Correlation between DaysRunning and Patrons")
print(df[['DaysRunning', 'Patrons']].corr())
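
`DataFrame.corr()` defaults to Pearson correlation, which measures linear association and can be pulled around by a handful of very large creators. A rank-based Spearman correlation is a reasonable complementary check. The sketch below uses made-up numbers and a hypothetical helper, `correlation_pair`, only to show how the two measures can differ on skewed data.

```python
import pandas as pd

def correlation_pair(frame: pd.DataFrame, x: str, y: str):
    """Return (Pearson, Spearman) correlation between columns x and y."""
    cols = frame[[x, y]]
    pearson = cols.corr(method="pearson").loc[x, y]
    spearman = cols.corr(method="spearman").loc[x, y]
    return pearson, spearman

# Hypothetical values for illustration only; one creator dwarfs the rest.
toy = pd.DataFrame({
    "DaysRunning": [100, 200, 300, 400, 500],
    "Patrons": [10, 20, 40, 80, 10000],
})
pearson, spearman = correlation_pair(toy, "DaysRunning", "Patrons")
print(f"Pearson: {pearson:.3f}, Spearman: {spearman:.3f}")
```

In the notebook the same call would be `correlation_pair(df, 'DaysRunning', 'Patrons')`. If the two numbers disagree strongly, the relationship is probably monotonic but not linear, or driven by outliers.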

Let's flip the x-axis and look at the launch date instead of the number of days running. We lose some precision, since the launch date records only the month and year, not the day.
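
To make the loss of precision concrete, here is a minimal sketch of how `pd.to_datetime` treats month-and-year strings. The two sample strings are hypothetical and simply follow the `%b-%y` layout that the parsing code expects; every parsed value falls on the first day of its month.

```python
import pandas as pd

# Hypothetical strings in the same 'Mon-YY' layout as the Launched column.
launched = pd.Series(["May-13", "Jan-20"])
parsed = pd.to_datetime(launched, format="%b-%y")

# With no day in the input, pandas fills in day 1 of each month.
formatted = parsed.dt.strftime("%Y-%m-%d").tolist()
print(formatted)
```

Creators who launched in the same month therefore share one x position in the scatter plots.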

In [None]:
df['Launched'] = pd.to_datetime(df['Launched'],format='%b-%y')
fig5 = px.scatter(df, x = 'Launched', y = 'Patrons')
fig5.show()

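# numeric_only=True keeps text columns out of the average; pandas 2.x raises a TypeError otherwise.
df_bylaunch = df.groupby('Launched').mean(numeric_only=True).reset_index()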
df_bylaunch.rename(columns = {'Patrons':'Mean Patrons'}, inplace=True)
fig6 = px.scatter(df_bylaunch, x = 'Launched', y = 'Mean Patrons')
fig6.show()
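
A monthly mean can rest on very few creators, so a single large account may dominate a point in the plot. One way to check, shown here as a sketch on made-up data rather than the notebook's results, is to report the median and the group size alongside the mean.

```python
import pandas as pd

# Hypothetical rows for illustration; the notebook's df has the same two columns.
toy = pd.DataFrame({
    "Launched": pd.to_datetime(["2013-05-01", "2013-05-01", "2014-01-01"]),
    "Patrons": [100, 300, 5000],
})

# One row per launch month with mean, median, and number of creators.
summary = (
    toy.groupby("Launched")["Patrons"]
    .agg(["mean", "median", "count"])
    .reset_index()
)
print(summary)
```

Months with a small `count` deserve caution when reading the mean; the `count` column could also be mapped to marker size in the scatter plot.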