# Average revenue and average budget per year
To calculate the average revenue and average budget per year for movies in a pandas DataFrame:
1. Convert the 'release_date' column to datetime.
2. Extract the year from each release date into a new 'year' column.
3. Group the DataFrame by 'year' with groupby().
4. Compute the mean of 'revenue' and 'budget' for each group with agg().

In [1]:
import pandas as pd

#### Example DataFrame

In [2]:
data = {
    'title' : ['Movie A', 'Movie B', 'Movie C', 'Movie D', 'Movie E'],
    'release_date' : ['2015-02-09', '2018-05-22', '2018-11-12', '2021-12-15', '2021-05-10'],
    'revenue' : [500000, 1200000, 800000, 2000000, 1800000],
    'budget' : [100000, 300000, 250000, 600000, 500000]
}
df = pd.DataFrame(data)
df

|   | title | release_date | revenue | budget |
|---|---|---|---|---|
| 0 | Movie A | 2015-02-09 | 500000 | 100000 |
| 1 | Movie B | 2018-05-22 | 1200000 | 300000 |
| 2 | Movie C | 2018-11-12 | 800000 | 250000 |
| 3 | Movie D | 2021-12-15 | 2000000 | 600000 |
| 4 | Movie E | 2021-05-10 | 1800000 | 500000 |


#### Step 1: Convert 'release_date' to datetime type

In [3]:
df['release_date'] = pd.to_datetime(df['release_date'])
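
The example dates above are all well formed, so the conversion succeeds as written. Real datasets often contain empty or malformed dates, and by default pd.to_datetime raises an error on them. A minimal sketch of one way to handle that, assuming a hypothetical messy series (raw_dates is made up for illustration and is not part of the example DataFrame):

```python
import pandas as pd

# Hypothetical messy input, used only to illustrate the behaviour
raw_dates = pd.Series(['2015-02-09', 'not a date', None])

# errors='coerce' turns values that cannot be parsed into NaT instead of raising
parsed = pd.to_datetime(raw_dates, errors='coerce')

# Count the entries that could not be parsed (or were missing to begin with)
n_missing = int(parsed.isna().sum())
print(parsed)
print(n_missing)
```

Rows whose date becomes NaT get a missing year in the next step, and groupby() leaves them out of the yearly averages by default, so it is worth checking n_missing before aggregating.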

#### Step 2: Extract the year from 'release_date'


In [4]:
df['year'] = df['release_date'].dt.year

#### Step 3: Group by 'year' and calculate average revenue and average budget


In [8]:
# Named aggregation: output_column=(input_column, aggregation_function)
avg_stats_per_year = df.groupby('year').agg(
    avg_revenue=('revenue', 'mean'),
    avg_budget=('budget', 'mean'),
).reset_index()
avg_stats_per_year

|   | year | avg_revenue | avg_budget |
|---|---|---|---|
| 0 | 2015 | 500000.0 | 100000.0 |
| 1 | 2018 | 1000000.0 | 275000.0 |
| 2 | 2021 | 1900000.0 | 550000.0 |


- .groupby('year') groups the rows of the DataFrame by the value in the 'year' column.
- .agg() applies an aggregation function (here, 'mean') to each group. Each keyword argument names an output column and gives the input column and function to use:
  - 'avg_revenue': the mean of the 'revenue' column.
  - 'avg_budget': the mean of the 'budget' column.
- .reset_index() moves 'year' out of the index and back into a regular column, so the result has a default integer index.
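
The aggregation above can also be written without adding a 'year' column to the DataFrame. The sketch below is one possible variant, not the only way to do it: it rebuilds the same example data, groups directly on the year extracted from 'release_date', and keeps the original column names instead of 'avg_revenue' and 'avg_budget'. The name yearly_means is chosen here for illustration.

```python
import pandas as pd

# Same example data as above, limited to the columns needed here
df = pd.DataFrame({
    'release_date': ['2015-02-09', '2018-05-22', '2018-11-12', '2021-12-15', '2021-05-10'],
    'revenue': [500000, 1200000, 800000, 2000000, 1800000],
    'budget': [100000, 300000, 250000, 600000, 500000],
})
df['release_date'] = pd.to_datetime(df['release_date'])

# Group on the year Series directly; rename it so the group key is labelled 'year'
yearly_means = (
    df.groupby(df['release_date'].dt.year.rename('year'))[['revenue', 'budget']]
      .mean()
)
print(yearly_means)
```

Here 'year' stays in the index of yearly_means; calling .reset_index() on the result turns it into a regular column, as in the version above.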