<a href="https://colab.research.google.com/github/dlont/hep/blob/main/pandas/pandas_transformation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
import pandas as pd
import numpy as np

Generate a time range covering one hour at one-minute intervals (61 timestamps, because both endpoints are included)

In [None]:
time_range = pd.date_range(start="2024-01-01 00:00", end="2024-01-01 01:00", freq="1min")
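
Because `pd.date_range` includes both `start` and `end` by default, the range above has one more point than the number of minutes in the hour. The sketch below compares it with a range built from `periods=60`; the variable names `inclusive_range` and `sixty_minutes` are illustrative and not part of the original notebook.

```python
import pandas as pd

start = "2024-01-01 00:00"

# Both endpoints are included, so 00:00 through 01:00 yields 61 timestamps.
inclusive_range = pd.date_range(start=start, end="2024-01-01 01:00", freq="1min")

# Asking for a fixed number of periods yields exactly 60 timestamps (00:00 through 00:59).
sixty_minutes = pd.date_range(start=start, periods=60, freq="1min")

print(len(inclusive_range), len(sixty_minutes))
print(sixty_minutes[-1])
```

Either form works for the rest of the notebook; the choice only matters if exactly 60 samples are needed.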

Simulate random event rates (number of particle collisions) per minute

In [None]:
np.random.seed(42)  # For reproducibility
event_rates = np.random.poisson(lam=23, size=len(time_range))  # Mean event rate of 23 collisions per minute

Create a DataFrame with time-series data

In [None]:
df = pd.DataFrame({
    'Timestamp': time_range,
    'Event Rate': event_rates
})

Set 'Timestamp' as the index

In [None]:
df.set_index('Timestamp', inplace=True)

1. Mapping Values using map()<br>
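Classify the event rate as 'Low' (below 15), 'Medium' (15 to 25 inclusive), or 'High' (above 25)

In [None]:
def classify_event_rate(rate):
    """Return a category label for a per-minute event rate."""
    if rate < 15:
        return 'Low'
    elif rate <= 25:  # covers 15 through 25, including exactly 15
        return 'Medium'
    else:
        return 'High'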

Apply the classification to each event rate using map()

In [None]:
df['Event Rate Category'] = df['Event Rate'].map(classify_event_rate)
print("\nDataFrame after mapping event rates to categories:")
print(df.head(10))
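
As a possible alternative to mapping a Python function over every element, the same binning can be sketched with `pd.cut`. This assumes the rates are integer counts and that a rate of exactly 15 should count as 'Medium'; the sample values in `rates` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample of integer event rates, chosen to hit the bin edges.
rates = pd.Series([10, 15, 20, 25, 26, 40], name='Event Rate')

# pd.cut uses right-closed bins by default: (-inf, 14], (14, 25], (25, inf).
# For integer counts this matches "below 15", "15 to 25", and "above 25".
categories = pd.cut(
    rates,
    bins=[-np.inf, 14, 25, np.inf],
    labels=['Low', 'Medium', 'High'],
)

print(pd.DataFrame({'Event Rate': rates, 'Event Rate Category': categories}))
```

Note that `pd.cut` returns a categorical column rather than plain strings, which keeps the ordering Low < Medium < High available for sorting and grouping.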

2. Lambda Functions with apply()<br>
Add a column that flags whether the event rate exceeds 25, using a lambda function

In [None]:
df['High Event Rate'] = df['Event Rate'].apply(lambda x: 'Yes' if x > 25 else 'No')
print("\nDataFrame after applying a lambda function to flag high event rates:")
print(df.head(10))
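
For a simple threshold like this one, a vectorized comparison may be preferable to calling a lambda once per row. The sketch below uses `np.where` on a small made-up frame named `demo` (not the notebook's `df`) to produce the same 'Yes'/'No' flag.

```python
import numpy as np
import pandas as pd

# Hypothetical event rates, including the boundary value 25.
demo = pd.DataFrame({'Event Rate': [18, 26, 25, 31]})

# The comparison runs on the whole column at once; np.where picks 'Yes' or 'No' per element.
demo['High Event Rate'] = np.where(demo['Event Rate'] > 25, 'Yes', 'No')

print(demo)
```

The result should match the `apply()` version; the difference is mainly speed on large frames.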