vignettes/working_with_calculated_metrics.Rmd

---
title: "Analysis Workflows And Calculated Metrics"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Analysis Workflows And Calculated Metrics}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

## A little housekeeping

This vignette uses the Adobe Analytics University Student Access account to show how to work with calculated metrics using R and the API. If you do not have access to the account, you can easily get it by going to the free LinkedIn Learning course, "Adobe Analytics Essential Training", and registering for access.  

If you haven't yet taken the course, it is well worth your time. Eric Matisoff does a great job walking through all the different amazing parts of Adobe Analytics, especially Analysis Workspace.  Specifically, check out the third section of the first chapter to see a quick overview of the calculated metrics builder user interface in action.

```{r, include = FALSE, eval=FALSE}
knitr::opts_chunk$set(
  collapse = FALSE,
  # eval = FALSE,
  warning = FALSE,
  comment = "#>"
)
```

## Setup 

I won't be going into the any detail around authenticate in this article so if you are needing more help with that, make sure to check out the "Getting Started" vignette. With all that being set, let's get things going!  First, we need to load the libraries and authenticate.

```
## Load the packages needed
library(adobeanalyticsr)
library(tidyr) ## needed for some of the data wrangling
library(jsonlite) ## Will help in visualizing calculated metrics definitions

aw_auth('oauth')

##Check to make sure you have been logged into the correct account profile
get_me()


company_id <- 'adobea8cf'
rsid <- 'igeo1xxpnwcidadobepm'

## Create a segment to be used farther down in the demo
#create the segment rule
segment_rule <- seg_rule(dimension = 'mobiledevicename',
                         verb = 'exists')

#build the segment and capture the segment id
seginfo <- seg_build(name = 'NON Desktop traffic', 
                     description = 'This is a segment that filters out everything that is not associated with an identified mobile device', 
                     rules = list(segment_rule), 
                     create_seg = T)
segid <- jsonlite::fromJSON(seginfo)$id
```

| globalCompanyId | companyName |
| -------- | ------- |
| adobea8cf	| Adobe Analytics University Student Access |

## Retrieve multiple calculated metrics

The following example shows a calculated metrics request for a response localized in US English, limited to the first page, and with the size of ten responses per page.

```
cms <- aw_get_calculatedmetrics(includeType = 'all', 
                                rsids = rsid, 
                                company_id = company_id)
knitr::kable(head(cms), format = 'pipe')
```
|rsid                 |id                                   |name                           |description                  |owner     |polarity | precision|type    |
|:--------------------|:------------------------------------|:------------------------------|:----------------------------|:---------|:--------|---------:|:-------|
|igeo1xxpnwcidadobepm |cm300010142_612a3e49262f3f1415082cee |Page                           |                             |200456743 |positive |         0|decimal |
|igeo1xxpnwcidadobepm |cm300010142_61336bad44b1bd4f51a9e2ae |Bounce Rate (Bounces / Visits) |                             |200456743 |positive |         1|percent |
|igeo1xxpnwcidadobepm |cm300010142_61336f20c7e6035b10558588 |Pages Count                    |                             |200456743 |positive |         0|decimal |
|igeo1xxpnwcidadobepm |cm300010142_61416ee549ff383abc37c6ab |cart convert 1                 |cart checkouts per page view |200483715 |positive |         4|decimal |
|igeo1xxpnwcidadobepm |cm300010142_614179ec49ff383abc37c6c3 |Conversions                    |                             |200484198 |positive |         0|decimal |
|igeo1xxpnwcidadobepm |cm300010142_614a8d494ba44e49bb383be5 |cart convert 1                 |                             |200484193 |positive |         4|decimal |

## Retrieving a single calculated metric

To retrieve a single calculated metric, include its id in the request.

```
cm <- aw_get_calculatedmetrics(filterByIds = cms$id[[1]], #add the calculated metric
                               company_id = company_id,
                               rsids = rsid, 
                               expansion = 'definition' #use the expansion argument to see the definition
                               )
knitr::kable(cm)
```

|rsid                 |id                                   |name |description |owner     |polarity | precision|type    |definition          |
|:--------------------|:------------------------------------|:----|:-----------|:---------|:--------|---------:|:-------|:-------------------|
|igeo1xxpnwcidadobepm |cm300010142_612a3e49262f3f1415082cee |Page |            |200456743 |positive |         0|decimal |visualization-group |

## Calculated metric management functions

Calculated Metrics are comprised of several different mathematical functions that work on available metrics for a given report suite.

### Get all functions

Returns a full list of calculated metric functions that the user can access. See the [functions documentation](https://experienceleague.adobe.com/docs/analytics/components/calculated-metrics/calcmetrics-reference/cm-functions.html?lang=en) for more information on available functions. 

```
all_functions <- get_cm_functions(company_id = company_id)

knitr::kable(head(all_functions[1:6]))
```
|id                     |category |persistable |name                            |description                                                                                       |definition  |
|:----------------------|:--------|:-----------|:-------------------------------|:-------------------------------------------------------------------------------------------------|:-----------|
|col-sum                |basic    |TRUE        |Column Sum                      |Adds all of the numeric values for a metric within a column (across the elements of a dimension). |calc-metric |
|_new-lift-functional   |internal |FALSE       |NA                              |NA                                                                                                |calc-metric |
|_lift-cumul-avg-impl   |internal |FALSE       |NA                              |NA                                                                                                |calc-metric |
|_waskr-N-control       |internal |FALSE       |NA                              |NA                                                                                                |calc-metric |
|ls-intercept-quadratic |advanced |TRUE        |Quadratic regression: Intercept |Quadratic regression: Y = ( a + b X ) ^ 2, Returns a.                                             |calc-metric |
|_chi2-test-stat        |internal |FALSE       |NA                              |NA                                                                                                |calc-metric |

### Create a calculated metric

The process of creating a calculated metric has been designed in a modular way to enable the systematic approach to building and maintaining complex calculated metrics in the most efficient way. The goal is to enhance analysis and data science workflows.  The essential elements of a calculated metric are: function, formula, build.  The following code chunks will illustrate the basics of interacting with each element. 

#### Create a function object

Calculated Metrics are comprised of several different mathematical functions that work on available metrics for a given report suite. See the [functions documentation](https://experienceleague.adobe.com/docs/analytics/components/calculated-metrics/calcmetrics-reference/cm-functions.html?lang=en) for more information on available functions. 

Currently the package only supports 2 types of functions:

1. Any function that take a single metric
  - Examples of these functions include `Absolute Value (Row)` and `Column Maximum`.  Both of which apply a function on a single metric. For a full list make sure to see the [functions documentation](https://experienceleague.adobe.com/docs/analytics/components/calculated-metrics/calcmetrics-reference/cm-functions.html).
  
2. The `Approximate Count Distinct (dimension)` function
  - Returns the approximated distinct count of dimension items for the selected dimension. The function uses the HyperLogLog (HLL) method of approximating distinct counts.  It is configured to guarantee the value is within 5% of the actual value 95% of the time. 

```
?cm_function

cm_func <- cm_function(
  func = "col-sum",
  metric = "visits",
  seg_filter = NULL,
  rsid = rsid,
  company_id = company_id
)

jsonlite::toJSON(cm_func, pretty = T, auto_unbox = T)
```
{
  "func": "col-sum",
  "description": "Column Sum",
  "col": {
    "func": "metric",
    "name": "metrics/visits",
    "description": "Visits"
  }
} 

#### Create a formula object

A calculated metric formula takes to metric objects and applies an operator on them. You can nest formula or function objects within formulas to create complex operational containers. Keep in mind, unlike segment containers, these containers function like a math expression and determine the order of operations.

```
?cm_formula

cm_form <- cm_formula(
  operator = 'divide',
  metrics = list(cm_func, "singlepagevisits"),
  seg_filters = NA, #add segment filters for each metric if needed
  rsid = rsid,
  company_id = company_id
)

jsonlite::toJSON(cm_form, pretty = T, auto_unbox = T)
```
{
  "func": "divide",
  "col1": {
    "func": "col-sum",
    "description": "Column Sum",
    "col": {
      "func": "metric",
      "name": "metrics/visits",
      "description": "Visits"
    }
  },
  "col2": {
    "func": "metric",
    "name": "metrics/singlepagevisits",
    "description": "Single Page Visits"
  }
} 

#### Create a calculated metric object

Using the `cm_build` function gives the ability to create and validate simple and complex calculated metrics. If you are familiar with the calculated metrics user interface you should be able to quickly recognize what each aspect of the function arguments refer to but fo those who are less familiar the [calculated metrics builder documentation](https://experienceleague.adobe.com/docs/analytics/components/calculated-metrics/calcmetric-workflow/cm-build-metrics.html?lang=en#areas-of-the-calculated-metrics-builder) may be of great use.

```
?cm_build

cm_obj <- cm_build(
  name = 'Test Calculated Metric',
  description = 'Test cm description',
  formula = cm_form,
  seg_filter = NULL,
  polarity = "positive",
  precision = 0,
  type = "decimal",
  create_cm = FALSE, #should this be created in the UI
  debug = FALSE,
  rsid = rsid,
  company_id = company_id
)

cm_obj
```
{"rsid":"igeo1xxpnwcidadobepm","name":"Test Calculated Metric","description":"Test cm description","definition":{"formula":{"func":"divide","col1":{"func":"col-sum","description":"Column Sum","col":{"func":"metric","name":"metrics/visits","description":"Visits"}},"col2":{"func":"metric","name":"metrics/singlepagevisits","description":"Single Page Visits"}},"version":[1,0,0],"func":"calc-metric"},"polarity":"positive","precision":0,"type":"decimal"} 

### Validate a calculated metric object

Because report suites can have different configurations, dimensions, or metrics, a calculated metric that is valid in one report suite may not be valid in another. To determine which calculated metric to use in different report suites, and why it may or may not be available, you can use the cm_validate. This endpoint allows you to POST a definition along with a target report suite id. The validate endpoint responds with compatibility information on the calculated metric.

```
?cm_val

cm <- cm_val(cm_obj)

cm
```

[1] "The calculated metric definition IS VALID"

Once yo determine make we can create the new calculated metric in the UI by setting the argument `create_cm = TRUE`.

```
#create the calculated metric in the UI
cm_obj <- cm_build(
  name = 'Test Calculated Metric',
  description = 'Test cm description',
  formula = cm_form,
  seg_filter = NULL, #include an overall segment filter
  polarity = "positive",
  precision = 0,
  type = "decimal",
  create_cm = TRUE, #should this be created in the UI
  debug = FALSE,
  rsid = rsid,
  company_id = company_id
)

knitr::kable(cm_obj)
```

{"rsid":"igeo1xxpnwcidadobepm","id":"cm300010142_6537f60dde0e706af6156fb1","name":"Test Calculated Metric","description":"Test cm description","isDeleted":false,"migratedIds":[],"internal":false,"owner":{"id":200654087},"hidden":false,"componentType":"calculatedMetric","polarity":"positive","precision":0,"type":"decimal","definition":{"formula":{"func":"divide","col1":{"func":"col-sum","description":"Column Sum","col":{"func":"metric","name":"metrics/visits","description":"Visits"}},"col2":{"func":"metric","name":"metrics/singlepagevisits","description":"Single Page Visits"}},"func":"calc-metric","version":[1,0,0]},"compatibility":{"identityMetrics":[{"identity":"metrics/visits"},{"identity":"metrics/singlepagevisits"}],"functions":["col-sum","divide"],"validator_version":"1.0.0","supported_products":["oberon"],"supported_schema":["schema_oberon","schema_frag"]},"legacyId":"","categories":["Calculated Metrics"],"modified":"2023-10-24T16:51:25Z"}

### Adding a segment filter 

On every level of a calculated metric, function, formula, and object, you can add an additional level of control by inserting a segment filter.  

```

# use the new segment id to create a segment
cm_obj_seg_filter <- cm_build(
  name = 'Test Segment Filtered Calculated Metric',
  description = 'Test cm description that includes a segment id',
  formula = cm_form,
  seg_filter = segid, #adding the segment here
  create_cm = FALSE,
  rsid = rsid,
  company_id = company_id
)

cm_val(cm_obj_seg_filter)
```

[1] "The calculated metric definition IS VALID"

Now that we know it is a valid segment, let's create it and pull the data along with the previous segment to see the difference in the data.

```
# use the new segment id to create a segment
cm_obj_seg_filter <- cm_build(
  name = 'Test Segment Filtered Calculated Metric',
  description = 'Test cm description that includes a segment id',
  formula = cm_form,
  seg_filter = segid, #adding the segment here
  create_cm = TRUE,
  rsid = rsid,
  company_id = company_id
)

cm_seg_filter <- jsonlite::fromJSON(cm_obj_seg_filter)$id

cm_no_seg_filter <- jsonlite::fromJSON(cm_obj)$id

df_cm <- aw_freeform_table(company_id = company_id, 
                           rsid = rsid, 
                           metrics = c(cm_no_seg_filter, cm_seg_filter),
                           dimensions = 'daterangeday', 
                           prettynames = T)
knitr::kable(df_cm)
```

|Day        | Test Calculated Metric| Test Segment Filtered Calculated Metric|
|:----------|----------------------:|---------------------------------------:|
|2023-09-26 |              3707.0726|                                     Inf|
|2023-09-25 |               460.1371|                                465.8842|
|2023-09-27 |               401.8156|                                406.0459|
|2023-10-20 |               383.0642|                                348.4961|
|2023-10-14 |               373.7211|                                348.4961|

### Update a calculated metric

To update the name or description it is as simple as adding them within a list function of the `updates` argument.
```
?cm_update

cm_updated <- cm_update(
  id = cm_no_seg_filter,
  updates = list(name = "new name",
                 description = "this is a new description"),
  locale = "en_US",
  debug = FALSE,
  company_id = company_id
)
# Name Change
c(`old name` =  jsonlite::fromJSON(cm_obj)$name,`new name` = cm_updated$name)

# Description Change
c(`old description` =  jsonlite::fromJSON(cm_obj)$description,`new description` = cm_updated$description)
```

        old name                           new name 
"Test Calculated Metric"                  "new name" 

      old description                       new description 
  "Test cm description"               "this is a new description"

For updating an existing calculated metrics within the definition there is a more complex solution that needs to be done. 
```

cm_updated <- aw_get_calculatedmetrics(filterByIds = cm_updated$id, 
                                       expansion = 'definition')

#change the col1 values to reflect a new function `mean`
cm_updated$definition$formula$col1$func <- 'mean'
cm_updated$definition$formula$col1$description <- 'Mean'
cm_updated$definition$formula$col1$`include-zeros` <- FALSE

cm_update_complete <- cm_update(
  id = cm_updated$id,
  updates = jsonlite::unbox(cm_updated),
  locale = "en_US",
  debug = FALSE,
  company_id = company_id
)
```
`cm_update_complete$definition$formula$col1$func`
[1] "mean"

`cm_update_complete$definition$formula$col1$description`
[1] "Mean"

### Copy a Calculated Metric
```
cm_var <- aw_get_calculatedmetrics(filterByIds = jsonlite::fromJSON(cm_obj)$id, 
                                   expansion = 'definition')

jsonlite::toJSON(cm_var)

copy_res <- cm_copy(cm_id = cm_var$id, 
                    name = 'Here I copy the cm', 
                    description = 'I want to add a new description', 
                    polarity = 'negative', 
                    precision = 2, 
                    type = 'percent',
                    create_cm = T)

```

`copy_res`
"{"rsid":"igeo1xxpnwcidadobepm","id":"cm300010142_6537f722de0e706af6156fb3","name":"Here I copy the cm","description":"I want to add a new description","isDeleted":false,"migratedIds":[],"internal":false,"owner":{"id":200654087},"hidden":false,"componentType":"calculatedMetric","polarity":"negative","precision":2,"type":"percent","definition":{"formula":{"func":"divide","col1":{"func":"mean","description":"Mean","col":{"func":"metric","name":"metrics/visits","description":"Visits"},"include-zeros":false},"col2":{"func":"metric","name":"metrics/singlepagevisits","description":"Single Page Visits"}},"func":"calc-metric","version":[1,0,0]},"compatibility":{"identityMetrics":[{"identity":"metrics/visits"},{"identity":"metrics/singlepagevisits"}],"functions":["mean","divide"],"validator_version":"1.0.0","supported_products":["oberon"],"supported_schema":["schema_oberon","schema_frag"]},"legacyId":"","categories":["Calculated Metrics"],"modified":"2023-10-24T16:56:02Z"}"

`cm_val(copy_res)`
[1] "The calculated metric definition IS VALID"

### Delete calculated metrics
```
?cm_delete

deleted <- cm_delete(
  cm_id = cm_no_seg_filter,
  warn = FALSE,
  locale = "en_US",
  debug = FALSE,
  rsid = rsid,
  company_id = company_id
)
```
`r message('success: cm300010142_6537f60dde0e706af6156fb1 has been deleted')`

## Conclusion

The set of calculated metric functions are setup to enable the end user to easily incorporate calculated metrics into the analysis and data science workflows without needing to interact with the UI. At the same time, the application of integrating the API in workflows enables a broader set of analytics users to incorporate the results and ongoing analysis in the powerful and versatile Analysis Workspace.  Make sure to add an issue in Github or add a pull request if you find additional opportunities of development that would enhance the ability to analyze and communicate the results using calculated metrics in your workflows.