### Copyright 2024 Google LLC

```
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

https://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
```

# Calculate Partner Revenue by Asset Label using Asset Revenue Reports

Author: Haley Schafer

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/YouTubeLabs/code-samples/blob/main/calculate_revenue_by_asset_label/calculate_revenue_by_asset_label.ipynb)

<br>

This colab calculates your revenue by asset label. It is meant to work alongside the [Bulk Download Reports from the Reporting API colab](https://github.com/YouTubeLabs/code-samples/blob/main/authentication_and_report_bulk_download/authentication_and_report_bulk_download.ipynb) in this same [GitHub Repository](https://github.com/YouTubeLabs/code-samples), which downloads all of the available reports for a content owner into a `reports` folder in Google Drive.

<br>
<font color="red" size=5><strong>
Please note that this colab does not currently support asset label calculations for Music Publisher or Music Label content owners.
</strong></font>
<br>

**The default setup of this colab assumes that you have used our [Bulk Download Reports from the Reporting API colab](https://github.com/YouTubeLabs/code-samples/blob/main/authentication_and_report_bulk_download/authentication_and_report_bulk_download.ipynb) to download your revenue reports from the Reporting API. If you have downloaded the reports another way (e.g. programmatically, on your own, via the API), you may need to change the `REPORT_FOLDER_NAME` variable and/or the file path pointing to your revenue reports in Google Drive.**


## Types of Revenue by Asset Label

After you've downloaded your revenue reports from the YouTube Reporting API, you can calculate your revenue by asset label. The asset label field is only available in some revenue reports, so not all reports will be used in this script. This script uses the `partner_revenue` and `asset_labels` fields in the following revenue categories:
- [Ads Revenue](https://developers.google.com/youtube/reporting/v1/reports/system_managed/ads#aggregate-ad-revenue-per-asset)
- [Subscription Music Revenue](https://developers.google.com/youtube/reporting/v1/reports/system_managed/subscriptions#monthly-subscriptions-music-assets-revenue)
- [Subscription Non-Music Revenue](https://developers.google.com/youtube/reporting/v1/reports/system_managed/subscriptions#monthly-subscriptions-non-music-assets-revenue)
- [Ads Adjustment Revenue](https://developers.google.com/youtube/reporting/v1/reports/system_managed/ads#monthly-asset-ad-adjustment-revenue-summary)
- [Subscriptions Adjustment Music Revenue](https://developers.google.com/youtube/reporting/v1/reports/system_managed/subscriptions#monthly-music-asset-subscription-adjustment-revenue-raw)
- [Subscriptions Adjustment Non-Music Revenue](https://developers.google.com/youtube/reporting/v1/reports/system_managed/subscriptions#monthly-non-music-asset-subscription-adjustment-revenue-raw)

This colab does not take the following revenue categories or reports into account; you will need to account for them separately. Please check your [Payment Summary Report](https://support.google.com/youtube/answer/6085590?hl=en-GB#zippy=%2Cpayment-summary-report) for further details. This list is not exhaustive.
- [Shorts Revenue](https://support.google.com/youtube/answer/6085590?hl=en-GB#zippy=%2Cshorts-revenue-report%2Cshorts-ads-report%2Cshorts-subscription-report)
- [Transactions](https://support.google.com/youtube/answer/6085590?hl=en-GB#zippy=%2Ctransactions-revenue-report)
- [Paid Features](https://support.google.com/youtube/answer/6085590?hl=en-GB#zippy=%2Cpaid-features-report)
- [Tax Withholding](https://support.google.com/youtube/answer/10391273?hl=en-GB&ref_topic=9257988&sjid=7232590298913281400-EU)
- [Channel Level Adjustments](https://support.google.com/youtube/answer/6085590?hl=en-GB#zippy=%2Cchannel-level-adjustment-report)

<font color="red" size=4><strong>
Use at your own risk: It is your responsibility to check that the revenue calculations of this colab are correct for your revenue and that all applicable revenue categories are accounted for.
</strong></font>

## Once the partner revenue by asset label has been calculated, where can I find the breakdown?

Partner revenue will first be calculated by asset label on an individual report level and saved to a CSV file ending in `_revenue_by_asset_label_raw.csv` in the same folder that your reports are located in. This CSV will have one section per revenue report with the partner revenue by asset label for that report. The report that each section corresponds to is indicated at the top of the section.

Then, the total partner revenue by asset label will be calculated across all revenue reports that have an asset label column and saved to a CSV file ending in `_revenue_by_asset_label_summary.csv` in the same folder.


## Required input variables

Please provide values for the following five variables using the input selection below.

1. `FIRST_DAY_OF_MONTH` should be the first day of the month for which you want to
calculate revenue. If you have downloaded your revenue reports using the
YouTubeLabs colab to [Download Reports from the YouTube Reporting API](https://github.com/YouTubeLabs/code-samples/blob/main/authentication_and_report_bulk_download/authentication_and_report_bulk_download.ipynb) then your system-managed revenue reports will have this date in the report file name.

2. `REPORT_FOLDER_NAME` should be the name of the folder that all of your reports
are stored in. If you used the YouTubeLabs colab to [Download Reports from the YouTube Reporting API](https://github.com/YouTubeLabs/code-samples/blob/main/authentication_and_report_bulk_download/authentication_and_report_bulk_download.ipynb) and kept the default "reports" folder in your Google Drive, you do not need to change this value. **If you downloaded your reports to a different location on Google Drive**, then you will need to adjust this value.

3. `CONTENT_OWNER_ID` is the content owner ID of the content owner for which you want to calculate your revenue.

4. `SHOW_TOTALS` controls whether the total partner revenue is shown at the end of each category of revenue calculated. Leave this box checked to include the totals, which is the default.

5. `DECOMPRESSED` indicates whether your reports are already decompressed. If your reports were downloaded as gzip files and have not been manually decompressed, you will need to uncheck this box. If your reports are already decompressed (e.g. plain CSV files), which is the default in our colab to [Download Reports from the YouTube Reporting API](https://github.com/YouTubeLabs/code-samples/blob/main/authentication_and_report_bulk_download/authentication_and_report_bulk_download.ipynb), then you do not need to modify this setting.
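
As a minimal sketch of how these inputs could be sanity-checked before running the rest of the colab, the hypothetical helper `validate_inputs` below (it is not part of this colab) checks that the date parses, that it falls on the first day of a month, and that the folder name and content owner ID are not empty. The error messages are illustrative only.

```python
from datetime import datetime


def validate_inputs(first_day_of_month, report_folder_name, content_owner_id):
    """Return a list of problems found; an empty list means the inputs look usable."""
    problems = []
    try:
        parsed = datetime.strptime(first_day_of_month, "%Y-%m-%d")
    except ValueError:
        problems.append("FIRST_DAY_OF_MONTH must be a date in YYYY-MM-DD format.")
    else:
        # The colab derives the report date range from the first day of the month.
        if parsed.day != 1:
            problems.append("FIRST_DAY_OF_MONTH must be the first day of a month.")
    if not report_folder_name.strip():
        problems.append("REPORT_FOLDER_NAME must not be empty.")
    if not content_owner_id.strip():
        problems.append("CONTENT_OWNER_ID must not be empty.")
    return problems


# Placeholder values for illustration only.
print(validate_inputs("2024-08-01", "reports", "XxXxXxXxXxXxXxXxXxXxXx"))
print(validate_inputs("2024-08-15", "", "XxXxXxXxXxXxXxXxXxXxXx"))
```

Running a check like this first can surface an empty or mistyped input before Google Drive is mounted.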

In [1]:
# First day of the payment month you want to calculate.
FIRST_DAY_OF_MONTH = "" #@param {type:'date'}

#  Name of the folder the reports are in.
REPORT_FOLDER_NAME = "" #@param {type:'string'}

# ID of the Content Owner for which you want to calculate revenue.
CONTENT_OWNER_ID = "" #@param {type:'string'}

# Should the total partner revenue be shown at the bottom of each revenue
# category calculation?
# Default is True (totals are shown).
SHOW_TOTALS = True #@param {type: 'boolean'}

# Are your revenue reports already decompressed (plain CSV, not GZIP files)?
# Default is True (not GZIP files).
DECOMPRESSED = True #@param {type: 'boolean'}

## System-Managed (Downloadable) Financial Reports with an Asset Label Column for Ads and Subscriptions Revenue (Non-Music Partners)

Only one report with an `asset_labels` column per revenue category is required. Your content owner may not have revenue from each category every month. Please check your Monthly Payment Summary Report to verify which categories of revenue you have for the month that you're calculating revenue for.

<br>

If you don't have a specific category of revenue for that month, there is usually not a report generated for that revenue category. For example, if you do not have any Subscriptions Adjustment revenue this month, then you will not have the Monthly Subscription (Red) Adjustment Non Music Asset Revenue and Monthly Subscription (Red) Adjustment Music Asset Revenue reports.

<br>

This script will automatically skip over any reports listed below that are not available in the `REPORT_FOLDER_NAME` that you have specified.
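
To see ahead of time which report types would be skipped, a check along the following lines may help. This is a sketch under assumptions: `find_missing_report_types` is a hypothetical helper that is not part of this colab, and it relies on the `CONTENT_OWNER_ID_report_type_id_YYYY-MM` file naming used by the bulk download colab.

```python
def find_missing_report_types(filenames, report_types, content_owner_id, month_id):
    """Return the report types that have no matching file in `filenames`."""
    missing = []
    for report_type in report_types:
        # Same substring pattern this colab uses to locate a report file.
        expected = f"{content_owner_id}_{report_type}_{month_id}"
        if not any(expected in name for name in filenames):
            missing.append(report_type)
    return missing


# Illustrative folder listing; in the colab this could come from os.listdir().
example_files = [
    "CO123_content_owner_asset_ad_revenue_summary_a1_2024-08.csv",
    "CO123_content_owner_music_asset_red_revenue_raw_a1_2024-08.csv",
]
example_types = [
    "content_owner_asset_ad_revenue_summary_a1",
    "content_owner_music_asset_red_revenue_raw_a1",
    "content_owner_asset_ad_adjustment_revenue_summary_a1",
]
print(find_missing_report_types(example_files, example_types, "CO123", "2024-08"))
```

Comparing the returned list against your Monthly Payment Summary Report is one way to confirm that a missing report reflects a month with no revenue in that category, not a download problem.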

<br>

*Note: There is no asset labels field in the Shorts Ads or Subscription Revenue Reports as of August 2024.*

In [2]:
ASSET_LABEL_REPORT_TYPES = [

    # Ads Revenue Asset Summary
    "content_owner_asset_ad_revenue_summary_a1",

    # Monthly Subscriptions Non Music Assets Revenue
    "content_owner_non_music_asset_red_revenue_raw_a1",

    # Monthly Subscriptions Music Assets Revenue
    "content_owner_music_asset_red_revenue_raw_a1",

    # Monthly Ads Adjustment Asset Revenue Summary
    "content_owner_asset_ad_adjustment_revenue_summary_a1",

    # Monthly Subscription (Red) Adjustment Non Music Asset Revenue
    "content_owner_non_music_asset_red_adjustment_revenue_raw_a1",

    # Monthly Subscription (Red) Adjustment Music Asset Revenue
    "content_owner_music_asset_red_adjustment_revenue_raw_a1",

]

## Helper Functions For This Colab

These helper functions will be used throughout this colab to help with date and filepath processing.

In [3]:
from datetime import datetime, timedelta
from dateutil.relativedelta import relativedelta

def get_report_date_range(date_input):
  """Return string of report coverage dates based on Colab user input."""
  start_date = datetime.strptime(date_input, "%Y-%m-%d")
  end_date = start_date + relativedelta(months=+1) - timedelta(days=+1)
  return f"{start_date.strftime('%Y%m%d')}_to_{end_date.strftime('%Y%m%d')}"

def get_month_id(date_input):
  """Return the month covered in reporting period in YYYY-MM format."""
  return date_input[:-3]

## Create a list of asset raw CSV reports from the Google Drive folder containing all downloaded reports from the YouTube Reporting API for that date range.

The output of the code cell below confirms where this colab is looking for your revenue reports and for which month.
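
For a concrete sense of the two values derived from `FIRST_DAY_OF_MONTH`, the sketch below reproduces them with the standard library only. It uses `calendar.monthrange` in place of the `dateutil`-based helper in this colab, so treat it as an equivalent illustration for first-of-month inputs, not as the colab's own code; the function names are hypothetical.

```python
import calendar
from datetime import date


def report_date_range(first_day_of_month):
    """Return 'YYYYMMDD_to_YYYYMMDD' covering the whole month of the input date."""
    start = date.fromisoformat(first_day_of_month)
    # monthrange returns (weekday of first day, number of days in the month).
    last_day = calendar.monthrange(start.year, start.month)[1]
    end = start.replace(day=last_day)
    return f"{start:%Y%m%d}_to_{end:%Y%m%d}"


def month_id(first_day_of_month):
    """Return the YYYY-MM part of a YYYY-MM-DD date string."""
    return first_day_of_month[:7]


print(report_date_range("2024-08-01"))  # 20240801_to_20240831
print(report_date_range("2024-02-01"))  # 20240201_to_20240229
print(month_id("2024-08-01"))           # 2024-08
```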

In [None]:
DATE_RANGE = get_report_date_range(FIRST_DAY_OF_MONTH)
MONTH_ID = get_month_id(FIRST_DAY_OF_MONTH)

print(f"You are calculating revenue for this date range: {DATE_RANGE}.")
print(
    f"This colab is looking for the revenue reports in the "
    f"{REPORT_FOLDER_NAME} folder with the following date suffix in YYYY-MM "
    f"format: {MONTH_ID}"
)

## Mount Your Google Drive and Get Your Revenue Reports Ready

This code mounts your Google Drive so that the colab can access the revenue reports from the `REPORT_FOLDER_NAME` provided. You will need to give this colab permissions to view and create files in your Google Drive.

**If you have not used YouTubeLabs colab to [Download Reports from the YouTube Reporting API](https://github.com/YouTubeLabs/code-samples/blob/main/authentication_and_report_bulk_download/authentication_and_report_bulk_download.ipynb)** to download your revenue reports, then you will need to modify the filenames of your reports as mentioned above to fit the following format: `CONTENT_OWNER_ID_report_type_id_YYYY-MM`.

For example, the file name for the `content_owner_asset_ad_revenue_summary_a1` report for August 2024 revenue should be formatted like this: `XxXxXxXxXxXxXxXxXxXxXx_content_owner_asset_ad_revenue_summary_a1_2024-08.csv`
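
If you need to rename files yourself, the expected name can be assembled as in the sketch below. `expected_report_filename` is a hypothetical helper, and the `.csv` extension is an assumption based on the example above; adjust it if your reports are still gzip-compressed.

```python
def expected_report_filename(content_owner_id, report_type_id, month_id,
                             extension=".csv"):
    """Build a file name in the CONTENT_OWNER_ID_report_type_id_YYYY-MM format."""
    return f"{content_owner_id}_{report_type_id}_{month_id}{extension}"


# Placeholder content owner ID, matching the example in the text.
print(expected_report_filename(
    "XxXxXxXxXxXxXxXxXxXxXx",
    "content_owner_asset_ad_revenue_summary_a1",
    "2024-08",
))
```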

In [None]:
from google.colab import drive
import os

drive.mount("/content/drive")
report_directory = f"/content/drive/MyDrive/{REPORT_FOLDER_NAME}/"

def create_report_list():
  """Return list of reports with an asset-label column in REPORT_FOLDER_NAME."""
  rev_reports = []
  for report_type_id in ASSET_LABEL_REPORT_TYPES:
    for report in os.listdir(report_directory):
      if f'{CONTENT_OWNER_ID}_{report_type_id}_{MONTH_ID}' in report:
        rev_reports.append(f'{report_directory}{report}')
  return rev_reports

create_report_list()

## Calculate asset label revenue per table from all asset reports and save each report result to a table in a CSV.

The code below calculates the `partner_revenue` for each unique `asset_labels` value in your revenue reports and saves the output to a new CSV in your `REPORT_FOLDER_NAME`. You can identify the report that the asset label revenue corresponds to by the report name at the top of each partner revenue calculation table.

If there is no asset label specified for an asset, the code will automatically attribute the `partner_revenue` for those assets to a "No Asset Label" row.

For example:

`CONTENT_OWNER_ID_content_owner_asset_ad_revenue_summary_a1_MONTH_ID.csv`

| asset_labels         | partner_revenue |
|----------------------|-----------------|
|Label A               | 10.00           |
|Label B               | 15.00           |
|Label C               | 20.00           |
|No Asset Label        | 50.00           |
|Total Partner Revenue | 95.00           |

<br>

`CONTENT_OWNER_ID_content_owner_non_music_asset_red_revenue_raw_a1_MONTH_ID.csv`

| asset_labels          | partner_revenue |
|-----------------------|-----------------|
|Label A                | 12.00           |
|Label B                | 17.00           |
|Label C                | 22.00           |
|Total Partner Revenue  | 51.00           |

<br>
<br>

### Please note that if one asset has multiple labels, then they will appear in the same row.

For example, if you have an asset with the label `Label A` that generated `$10` of Ads revenue and another asset with two labels, `Label A` and `Label B`, that generated `$11` of Ads revenue, you will see two separate rows for them in the output of this script, alongside the rows for your other labels:

`CONTENT_OWNER_ID_content_owner_asset_ad_revenue_summary_a1_MONTH_ID.csv`

| asset_labels           | partner_revenue |
|------------------------|-----------------|
|Label A                 | 10.00           |
|Label A,Label B         | 11.00           |
|Label B                 | 15.00           |
|Label C                 | 20.00           |
|Total Partner Revenue   | 56.00           |
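
The grouping behaviour described above can be illustrated without pandas. The sketch below is a standard-library approximation of the idea, not the colab's implementation: it sums `partner_revenue` per distinct `asset_labels` string, treats an empty value as "No Asset Label", and keeps a combined value such as `Label A,Label B` as its own key. The sample rows and the `asset_id` column are invented for illustration.

```python
import csv
import io
from collections import defaultdict

# Invented sample data; real reports contain many more columns.
SAMPLE_REPORT = """asset_id,asset_labels,partner_revenue
A1,Label A,10.00
A2,"Label A,Label B",11.00
A3,,5.00
A4,Label A,2.50
"""


def revenue_by_label(csv_text):
    """Sum partner_revenue for each distinct asset_labels string."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # An empty asset_labels field is grouped under "No Asset Label".
        label = row["asset_labels"] or "No Asset Label"
        totals[label] += float(row["partner_revenue"])
    return dict(totals)


for label, revenue in sorted(revenue_by_label(SAMPLE_REPORT).items()):
    print(f"{label}: {revenue:.2f}")
```

Because the combined string is used as the key, revenue from an asset with several labels is not split between or added to the individual labels.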


In [None]:
from pathlib import Path
import pandas as pd
import numpy as np

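def calculate_report_level_asset_label_revenue(report_list, output_filename):
  """Write per-report partner revenue by asset label to output_filename."""
  empty_row = {"asset_labels": "", "partner_revenue": ""}
  for report in report_list:
    # Get the filename of the report. Each entry in report_list is already a
    # full path, so report_directory does not need to be prepended.
    report_filename = Path(report).name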

    # Create a data frame with the report's filename.
    title_row = pd.DataFrame(data=
                             {"Report File": [report_filename]})

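    # Read data from CSV file, fill in "No Asset Label" where there is no
    # asset label value. Only the asset_labels column is filled so that a
    # missing partner_revenue value is not replaced with text. Reports are
    # read as gzip when the DECOMPRESSED box is unchecked.
    partner_revenue_dataframe = pd.read_csv(report,
                      usecols=["asset_labels", "partner_revenue"],
                      compression="infer" if DECOMPRESSED else "gzip"
                     ).fillna({"asset_labels": "No Asset Label"})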

    # Show total partner revenue for the report that is being processed.
    total_partner_revenue = partner_revenue_dataframe['partner_revenue'].sum()
    print(f"Partner Revenue for {report_filename} is "
          f"${total_partner_revenue}")

    # Create a title pivot table to add the title of the report above the
    # revenue by asset label.
    title_pivot = pd.pivot_table(title_row, index=["Report File"])

    # Add the title pivot table to the CSV first.
    title_pivot.to_csv(f'{report_directory}{output_filename}',
                       mode="a")

    # Calculate sum of partner revenue by asset label.
    partner_rev_table = pd.pivot_table(partner_revenue_dataframe,
                                       index=["asset_labels"],
                                       values=["partner_revenue"],
                                       aggfunc="sum",
                                       ).sort_values(by=["asset_labels"],
                                                     ascending=False)

    # Append the total revenue for each report to the end of that report section
    # if the user has chosen this setting.
    if SHOW_TOTALS:
      partner_rev_table.loc['Total Partner Revenue'] = total_partner_revenue

    # Add the empty row to the end of the pivot table to force a blank line
    # between each report revenue list.
    partner_rev_table.loc[''] = empty_row

    # Add the revenue by asset label pivot table to the CSV.
    partner_rev_table.to_csv(f'{report_directory}{output_filename}',
                             mode="a")

  return (
          f"Your per-report asset label revenue calculations are ready in the "
          f"{output_filename} "
          f"file in the {REPORT_FOLDER_NAME} folder in Google Drive."
         )


calculate_report_level_asset_label_revenue(
    create_report_list(),
    f"{CONTENT_OWNER_ID}_{DATE_RANGE}_revenue_by_asset_label_raw.csv")

## Calculate total asset label revenue from all asset reports and save the result to a CSV.

The code below calculates the revenue by asset label across your available revenue reports with an `asset_labels` column in your `REPORT_FOLDER_NAME`.

The output will appear in the following format:

| asset_labels         | partner_revenue |
|----------------------|-----------------|
|Label A               | 22.00           |
|Label B               | 32.00           |
|Label C               | 42.00           |
|No Asset Label        | 50.00           |
|Total Partner Revenue | 146.00          |
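
The summary figures above follow from adding the per-report tables shown earlier. As a worked example of that arithmetic (a plain-Python sketch, not the pandas code used by this colab; `combine_report_totals` is a hypothetical name):

```python
from collections import defaultdict

# Per-report totals taken from the two example tables earlier in this colab.
ads_revenue = {"Label A": 10.0, "Label B": 15.0, "Label C": 20.0,
               "No Asset Label": 50.0}
subscription_revenue = {"Label A": 12.0, "Label B": 17.0, "Label C": 22.0}


def combine_report_totals(*per_report_totals):
    """Add up partner revenue per asset label across several reports."""
    combined = defaultdict(float)
    for totals in per_report_totals:
        for label, revenue in totals.items():
            combined[label] += revenue
    return dict(combined)


summary = combine_report_totals(ads_revenue, subscription_revenue)
print(summary)
print("Total Partner Revenue:", sum(summary.values()))  # 146.0
```

A label that is missing from one report, such as "No Asset Label" in the subscription example, simply contributes nothing from that report.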

In [None]:
import pandas as pd
import numpy as np

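def calculate_total_asset_label_revenue(report_list, output_filename):
  """Write total partner revenue by asset label across all reports."""
  # Concatenate all revenue reports together, fill in "No Asset Label" where
  # there is no asset label value. Only the asset_labels column is filled so
  # that a missing partner_revenue value is not replaced with text. Reports
  # are read as gzip when the DECOMPRESSED box is unchecked.
  partner_revenue_dataframe = pd.concat((pd.read_csv(report,
                               usecols=["asset_labels", "partner_revenue"],
                               compression="infer" if DECOMPRESSED else "gzip"
                               ).fillna({"asset_labels": "No Asset Label"})
                               for report in report_list))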

  # Calculate sum of partner revenue by asset label.
  partner_rev_table = pd.pivot_table(partner_revenue_dataframe,
                                     index=["asset_labels"],
                                     values=["partner_revenue"],
                                     aggfunc="sum"
                                     ).sort_values(by=["asset_labels"],
                                                   ascending=False)

  # Show total partner revenue across all reports that are being processed.
  total_partner_revenue = partner_revenue_dataframe['partner_revenue'].sum()
  print(f"Partner Revenue for all reports included in this calculation is "
        f"${total_partner_revenue}")

  # Append the total revenue across all reports to the end of the summary
  # if the user has chosen this setting.
  if SHOW_TOTALS:
    partner_rev_table.loc['Total Partner Revenue'] = total_partner_revenue

  # Write the revenue by asset label pivot table to the CSV. The file is
  # overwritten so that re-running this cell does not append a duplicate table.
  partner_rev_table.to_csv(f"{report_directory}{output_filename}",
                           mode="w")

  return (
          f"Your total asset label revenue calculations are ready in the "
          f"{CONTENT_OWNER_ID}_{DATE_RANGE}_revenue_by_asset_label_summary.csv "
          f"file in the {REPORT_FOLDER_NAME} folder in Google Drive."
         )

calculate_total_asset_label_revenue(
    create_report_list(),
    f"{CONTENT_OWNER_ID}_{DATE_RANGE}_revenue_by_asset_label_summary.csv")

## That's it!

The CSVs with your partner revenue by asset label for the available reports should now be in your `REPORT_FOLDER_NAME` in Drive for you to view in your spreadsheet program of choice.


<font color="red" size=4><strong>
It is your responsibility to ensure that these calculations are correct and that all revenue sources are included.
</strong></font>

## Disconnect Google Drive from this Colab.

In [None]:
drive.flush_and_unmount()
print("All changes made in this colab session should now be visible in Drive.")