# Getting Started
In this example, we have a data frame of transactions from different customers. To get an idea of how the data looks, we preview the first few rows and columns of the data frame.

In [None]:
%matplotlib inline
from featuretools.demo import load_mock_customer
df = load_mock_customer(return_single_table=True)
df.set_index('transaction_time', inplace=True)

df[df.columns[:7]].head()

We want to extract label times for each customer, where the label equals the total purchase amount over the next hour of transactions. First, we define the function that returns the total purchase amount given an hour of transactions.

In [None]:
def my_labeling_function(df_slice):
    label = df_slice["amount"].sum()
    return label

With the labeling function, we create the `LabelMaker` for our prediction problem. We need an hour of transactions for each label, so we set `window_size` to one hour.

In [None]:
from composeml import LabelMaker

label_maker = LabelMaker(
    target_entity="customer_id",
    time_index="transaction_time",
    labeling_function=my_labeling_function,
    window_size="1h",
)
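To make the windowing idea concrete, here is a minimal plain-Python sketch of what labeling one hour of transactions per customer amounts to. It does not show how `LabelMaker` is implemented. The sample transactions and the helpers `total_amount` and `label_first_window` are hypothetical, and the sketch assumes the window starts at each customer's first transaction and is half-open.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical transactions: (customer_id, transaction_time, amount).
transactions = [
    (1, datetime(2014, 1, 1, 0, 5), 20.0),
    (1, datetime(2014, 1, 1, 0, 40), 30.0),
    (1, datetime(2014, 1, 1, 1, 10), 50.0),
    (2, datetime(2014, 1, 1, 0, 15), 10.0),
]


def total_amount(amounts):
    # Same idea as the labeling function above: total purchase amount.
    return sum(amounts)


def label_first_window(rows, window_size=timedelta(hours=1)):
    """Label each customer's first window, starting at their first transaction."""
    by_customer = defaultdict(list)
    for customer_id, time, amount in rows:
        by_customer[customer_id].append((time, amount))

    labels = {}
    for customer_id, entries in by_customer.items():
        entries.sort()
        start = entries[0][0]
        end = start + window_size
        # Keep only transactions inside the half-open window [start, end).
        in_window = [amount for time, amount in entries if start <= time < end]
        labels[customer_id] = total_amount(in_window)
    return labels


window_labels = label_first_window(transactions)
print(window_labels)  # {1: 50.0, 2: 10.0}
```

Customer 1's third transaction falls outside the first hour, so it does not contribute to that window's label.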

With the label maker, we automatically search for and extract the labels from the data frame by using `search`.

In [None]:
labels = label_maker.search(
    df,
    minimum_data="1h",
    num_examples_per_instance=25,
    gap=1,
    verbose=True,
)

labels.head()

Next, we make the labels binary by using `threshold`, so that each label indicates whether the total purchase amount is above 3500.

In [None]:
labels = labels.threshold(3500)

labels.head()
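As a rough illustration of what thresholding does to continuous labels, the sketch below converts a few hypothetical totals into booleans using the same value passed to `threshold` in the cell above. It assumes a strict greater-than comparison, which may differ from the library's exact behavior at the boundary.

```python
# Hypothetical continuous labels (total purchase amounts).
totals = [1200.0, 3500.0, 4100.5]
threshold_value = 3500

# Assumption: a label is True only when the total is strictly above the threshold.
binary_labels = [total > threshold_value for total in totals]
print(binary_labels)  # [False, False, True]
```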

We can also shift the label times one hour earlier, so that predictions are made an hour in advance.

In [None]:
labels = labels.apply_lead('1h')

labels.head()
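Conceptually, applying a lead subtracts a fixed offset from each label time while leaving the label values unchanged. The sketch below shows this with the standard library only; the sample times and labels are hypothetical and are not taken from the data frame above.

```python
from datetime import datetime, timedelta

# Hypothetical (label time, label) pairs before applying a lead.
label_times = [
    (datetime(2014, 1, 1, 1, 0), True),
    (datetime(2014, 1, 1, 2, 0), False),
]
lead = timedelta(hours=1)

# Shift each time one hour earlier; the label itself is untouched.
shifted = [(time - lead, label) for time, label in label_times]
for time, label in shifted:
    print(time, label)
```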

With the labels, we can use `describe` to view the label distribution and the settings used to make the labels. We can also plot the distribution and the label counts over time.

In [None]:
labels.describe()

In [None]:
labels.plot.distribution(stacked=True)

In [None]:
labels.plot.count_by_time(figsize=(7, 5))