<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Setup" data-toc-modified-id="Setup-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Setup</a></span><ul class="toc-item"><li><span><a href="#Identifying-the-data-view-to-use-for-your-reports" data-toc-modified-id="Identifying-the-data-view-to-use-for-your-reports-1.1"><span class="toc-item-num">1.1&nbsp;&nbsp;</span>Identifying the data view to use for your reports</a></span></li></ul></li><li><span><a href="#Creating-a-request-for-a-report" data-toc-modified-id="Creating-a-request-for-a-report-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Creating a request for a report</a></span></li><li><span><a href="#Requesting-the-data" data-toc-modified-id="Requesting-the-data-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Requesting the data</a></span></li><li><span><a href="#Complex-data-extract" data-toc-modified-id="Complex-data-extract-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Complex data extract</a></span></li></ul></div>

The CJA API makes it possible to extract data from Customer Journey Analytics (CJA).\
This notebook first shows how to extract that data with `cjapy`.\
We will then apply some data analysis and data visualization that are not yet possible in CJA.

We start by importing `cjapy` and `pandas`, then loading the configuration file.

# Setup 

In [2]:
import cjapy
import pandas as pd
cjapy.importConfigFile('config_example.json')

## Identifying the data view to use for your reports

You need to know which data view to pull the data from.\
Based on the selected data view, we can extract data by placing a request for specific information.\
Creating requests via `cjapy` is simplified thanks to the `RequestCreator` class.

In [3]:
cja = cjapy.CJA()

Extracting the ID based on the name of the data view:

In [4]:
dataviews = cja.getDataViews()
dv_id = dataviews.at[dataviews[dataviews['name']=='Datanalyst'].index[0],'id']
dv_id

'dv_62c2c7ccb373f55b9f617157'
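
The one-liner above raises an `IndexError` when no data view matches the name. A more defensive lookup could look like the sketch below. The `find_data_view_id` helper and the stand-in DataFrame are hypothetical; in practice `dataviews` would be the result of `cja.getDataViews()`, assumed to expose the `name` and `id` columns used above.

```python
import pandas as pd


def find_data_view_id(dataviews: pd.DataFrame, name: str) -> str:
    """Return the id of the data view whose name matches exactly.

    Hypothetical helper: raises a clear error when the name is missing
    and refuses to guess when several data views share the same name.
    """
    matches = dataviews.loc[dataviews["name"] == name, "id"]
    if matches.empty:
        raise ValueError(f"No data view named {name!r}")
    if len(matches) > 1:
        raise ValueError(f"{len(matches)} data views are named {name!r}")
    return matches.iloc[0]


# Stand-in for the DataFrame returned by cja.getDataViews();
# the second row is made up for illustration.
dataviews = pd.DataFrame(
    {
        "name": ["Datanalyst", "Another data view"],
        "id": ["dv_62c2c7ccb373f55b9f617157", "dv_example"],
    }
)

dv_id = find_data_view_id(dataviews, "Datanalyst")
print(dv_id)
```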

Extracting the dimensions and metrics related to that data view:

In [5]:
dimensions = cja.getDimensions(dv_id,full=True)
metrics = cja.getMetrics(dv_id,full=True)

In [10]:
dimensions.sample(4)

Unnamed: 0,id,name,description,sourceFieldId,sourceFieldName,storageId,dataSetIds,dataSetType,schemaType,sourceFieldType,...,fieldDefinition,noValueOptionsSetting,defaultDimensionSort,persistenceSetting,behaviorSetting,substringSetting,baseTableName,required,derivedFieldCompatible,labels
8,variables/adobe_timespentpersession,Time Spent per Session,,,,,,event,,standard,...,,"{'includeNoneByDefault': False, 'noneChangeabl...",True,,,,,,False,
38,variables/timepartquarterofyear,Quarter of Year,,,,,,event,,standard,...,"[{'func': 'raw-field', 'id': 'adobe_datetime',...","{'includeNoneByDefault': False, 'noneChangeabl...",False,,,,hits,True,True,
51,variables/environment.browserDetails.userAgent...,Brand,"The user-agent's commercial name (e.g., cURL, ...",environment.browserDetails.userAgentClientHint...,Brand,explicitbrand|ee84cac1a0c55e121763brands,[6059fd4fc52f8819484a7c1c],event,string,custom,...,"[{'func': 'raw-field', 'id': 'environment.brow...","{'includeNoneByDefault': True, 'noneChangeable...",False,{'enabled': False},{'lowercase': False},{'enabled': False},,,,
0,variables/placeContext.geo.countryCode,Country code,The two-character [ISO 3166-1 alpha-2](https:/...,placeContext.geo.countryCode,Country code,explicitp9f128de542059012c91a65ntrycode|hits,[6059fd4fc52f8819484a7c1c],event,string,custom,...,"[{'func': 'raw-field', 'id': 'placeContext.geo...","{'includeNoneByDefault': True, 'noneChangeable...",False,{'enabled': False},{'lowercase': False},{'enabled': False},,,,


In [8]:
metrics.sample(4)

Unnamed: 0,id,name,description,dataSetType,sourceFieldType,baseTableName,type,hideFromReporting,hasData,segmentable,...,sourceFieldName,storageId,dataSetIds,schemaType,tableName,schemaPath,multiValued,includeExcludeSetting,required,attributionSetting
14,metrics/web.webInteraction.linkClicks.value,web.webInteraction.linkClicks.value,The quantifiable value of this measure.,event,custom,,decimal,False,True,True,...,web.webInteraction.linkClicks.value,explicitw94a6416e5fd321e097a91avalue|hits,[6059fd4fc52f8819484a7c1c],double,hits,web.webInteraction.linkClicks.value,False,{'enabled': False},,
4,metrics/visits,Sessions,,event,standard,,int,,True,True,...,,,,,,,,,True,
5,metrics/placeContext.geo._schema.latitude,Latitude,The signed vertical coordinate of a geographic...,event,custom,,decimal,False,True,True,...,Latitude,explicitpd83d3414456639c973e3fclatitude|hits,[6059fd4fc52f8819484a7c1c],double,hits,placeContext.geo._schema.latitude,False,{'enabled': False},,
7,metrics/occurrences,Events,,event,standard,,int,,True,True,...,,,,,,,,,True,
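
Sampling rows is handy for a first look, but finding the ID of a specific component is usually easier with a text search on the `name` column. The sketch below uses a hypothetical `search_components` helper on a stand-in DataFrame built from a few of the rows shown above; with real data you would pass the `dimensions` or `metrics` DataFrame directly.

```python
import pandas as pd


def search_components(components: pd.DataFrame, text: str) -> pd.DataFrame:
    """Return the id and name of components whose name contains `text`.

    The match is case-insensitive and treats `text` as a plain string,
    not as a regular expression.
    """
    mask = components["name"].str.contains(text, case=False, regex=False, na=False)
    return components.loc[mask, ["id", "name"]]


# Stand-in rows copied from the samples above.
dimensions = pd.DataFrame(
    {
        "id": [
            "variables/adobe_timespentpersession",
            "variables/timepartquarterofyear",
            "variables/placeContext.geo.countryCode",
        ],
        "name": ["Time Spent per Session", "Quarter of Year", "Country code"],
    }
)

found = search_components(dimensions, "country")
print(found)
```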


# Creating a request for a report

`cjapy` provides a class named `RequestCreator` that simplifies the creation of data requests for CJA.

We start by instantiating a request:

In [9]:
myReportRequest = cjapy.RequestCreator()

I will define the following on my request:
* a data view ID 
* the time frame that is being used for the report
* the metrics that I want to extract
* the dimension I want to extract

In [11]:
myReportRequest.setDataViewId(dv_id)
myReportRequest.setDateRange('2024-11-01','2024-11-30')
myReportRequest.addMetric('metrics/occurrences')
myReportRequest.addMetric('metrics/visits')
myReportRequest.setDimension('variables/placeContext.geo.countryCode')

You can always check the definition of your request by evaluating the object in a notebook cell

In [12]:
myReportRequest

{
    "globalFilters": [
        {
            "type": "dateRange",
            "dateRange": "2024-11-01T00:00:00.000/2024-11-30T23:59:59.999"
        }
    ],
    "metricContainer": {
        "metrics": [
            {
                "columnId": "0",
                "id": "metrics/occurrences",
                "sort": "desc"
            },
            {
                "columnId": "1",
                "id": "metrics/visits"
            }
        ],
        "metricFilters": []
    },
    "dimension": "variables/placeContext.geo.countryCode",
    "settings": {
        "countRepeatInstances": true,
        "limit": 20000,
        "page": 0,
        "nonesBehavior": "exclude-nones",
        "sampling": null,
        "samplingUpSample": null
    },
    "statistics": {
        "functions": [
            "col-max",
            "col-min"
        ]
    },
    "dataId": "dv_62c2c7ccb373f55b9f617157",
    "identityOverrides": [],
    "capacityMetadata": {
        "associations": [
            {

or by using the `to_dict()` method

In [13]:
myReportRequest.to_dict()

{'globalFilters': [{'type': 'dateRange',
   'dateRange': '2024-11-01T00:00:00.000/2024-11-30T23:59:59.999'}],
 'metricContainer': {'metrics': [{'columnId': '0',
    'id': 'metrics/occurrences',
    'sort': 'desc'},
   {'columnId': '1', 'id': 'metrics/visits'}],
  'metricFilters': []},
 'dimension': 'variables/placeContext.geo.countryCode',
 'settings': {'countRepeatInstances': True,
  'limit': 20000,
  'page': 0,
  'nonesBehavior': 'exclude-nones',
  'sampling': None,
  'samplingUpSample': None},
 'statistics': {'functions': ['col-max', 'col-min']},
 'dataId': 'dv_62c2c7ccb373f55b9f617157',
 'identityOverrides': [],
 'capacityMetadata': {'associations': [{'name': 'applicationName',
    'value': 'cjapy Python Library'}]}}
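
To make the structure of this dictionary easier to read, the sketch below assembles an equivalent one by hand. `build_request` is a hypothetical helper, not part of `cjapy`: it only mirrors the keys and values visible in the output above and hard-codes the settings shown there.

```python
def build_request(data_view_id, start, end, metrics, dimension):
    """Assemble a report request dictionary by hand (illustrative only)."""
    metric_items = []
    for position, metric_id in enumerate(metrics):
        item = {"columnId": str(position), "id": metric_id}
        if position == 0:
            # In the output above, only the first metric carries a sort order.
            item["sort"] = "desc"
        metric_items.append(item)

    return {
        "globalFilters": [
            {
                "type": "dateRange",
                # Full days: midnight of `start` to the last millisecond of `end`.
                "dateRange": f"{start}T00:00:00.000/{end}T23:59:59.999",
            }
        ],
        "metricContainer": {"metrics": metric_items, "metricFilters": []},
        "dimension": dimension,
        "settings": {
            "countRepeatInstances": True,
            "limit": 20000,
            "page": 0,
            "nonesBehavior": "exclude-nones",
            "sampling": None,
            "samplingUpSample": None,
        },
        "statistics": {"functions": ["col-max", "col-min"]},
        "dataId": data_view_id,
        "identityOverrides": [],
        "capacityMetadata": {
            "associations": [
                {"name": "applicationName", "value": "cjapy Python Library"}
            ]
        },
    }


request = build_request(
    "dv_62c2c7ccb373f55b9f617157",
    "2024-11-01",
    "2024-11-30",
    ["metrics/occurrences", "metrics/visits"],
    "variables/placeContext.geo.countryCode",
)
print(request["globalFilters"][0]["dateRange"])
```

Seeing the plain dictionary makes it clearer what each `RequestCreator` call contributes: the date range becomes a global filter, each metric becomes a numbered column, and the dimension is a single top-level key.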

# Requesting the data

Once your data request is ready, you can pass the object directly to the `getReport` method.\
This returns a complex object that we will examine.

In [14]:
myreport = cja.getReport(myReportRequest)

`getReport` returns a `Workspace` object that has a `dataframe` attribute.\
This dataframe contains all the data related to your request.

In [16]:
myreport.dataframe.head()

| | itemId | variables/placeContext.geo.countryCode | metrics/occurrences | metrics/visits |
|---|---|---|---|---|
| 0 | US | US | 468.0 | 327.0 |
| 1 | IN | IN | 347.0 | 217.0 |
| 2 | GB | GB | 122.0 | 54.0 |
| 3 | RU | RU | 82.0 | 82.0 |
| 4 | CA | CA | 71.0 | 38.0 |
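
Because the result is a regular pandas DataFrame, further analysis can be done directly on it. As a minimal sketch, the example below rebuilds the five rows shown above as a stand-in for `myreport.dataframe`, gives the columns shorter names (`country`, `events`, `sessions` and `events_per_session` are names chosen here for illustration), and derives the number of events per session.

```python
import pandas as pd

# Stand-in for myreport.dataframe, using the five rows shown above.
df = pd.DataFrame(
    {
        "itemId": ["US", "IN", "GB", "RU", "CA"],
        "variables/placeContext.geo.countryCode": ["US", "IN", "GB", "RU", "CA"],
        "metrics/occurrences": [468.0, 347.0, 122.0, 82.0, 71.0],
        "metrics/visits": [327.0, 217.0, 54.0, 82.0, 38.0],
    }
)

# Shorter column names make the derived metric easier to read.
tidy = df.rename(
    columns={
        "variables/placeContext.geo.countryCode": "country",
        "metrics/occurrences": "events",
        "metrics/visits": "sessions",
    }
).drop(columns="itemId")

# Derived metric: average number of events in a session, per country.
tidy["events_per_session"] = (tidy["events"] / tidy["sessions"]).round(2)
tidy = tidy.sort_values("events_per_session", ascending=False, ignore_index=True)
print(tidy)
```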


You can also get the request that has been sent by using `dataRequest`

In [17]:
myreport.dataRequest

{
    "globalFilters": [
        {
            "type": "dateRange",
            "dateRange": "2024-11-01T00:00:00.000/2024-11-30T23:59:59.999"
        }
    ],
    "metricContainer": {
        "metrics": [
            {
                "columnId": "0",
                "id": "metrics/occurrences",
                "sort": "desc"
            },
            {
                "columnId": "1",
                "id": "metrics/visits"
            }
        ],
        "metricFilters": []
    },
    "dimension": "variables/placeContext.geo.countryCode",
    "settings": {
        "countRepeatInstances": false,
        "limit": 20000,
        "page": 0,
        "nonesBehavior": "exclude-nones",
        "sampling": null,
        "samplingUpSample": null
    },
    "statistics": {
        "functions": [
            "col-max",
            "col-min"
        ],
        "ignoreZeroes": false
    },
    "dataId": "dv_62c2c7ccb373f55b9f617157",
    "identityOverrides": [],
    "capacityMetadata": {
       

You can see how many pages have been requested: the `pageRequested` attribute returns the number of pages that were requested to retrieve the data.\
Each page is 50 K rows.

In [18]:
myreport.pageRequested

1
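
The relationship between result size and page count is a simple ceiling division. The sketch below is only arithmetic, not a `cjapy` call: `pages_needed` is a hypothetical helper, and the page size is a parameter whose default is the 50 K rows quoted above, since the effective value may depend on how the report is requested.

```python
import math

ROWS_PER_PAGE = 50_000  # page size quoted in the text above (assumption)


def pages_needed(total_rows: int, rows_per_page: int = ROWS_PER_PAGE) -> int:
    """Number of pages required to return `total_rows` rows."""
    if total_rows <= 0:
        return 0
    return math.ceil(total_rows / rows_per_page)


for rows in (1, 50_000, 50_001, 120_000):
    print(rows, pages_needed(rows))
```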

You can get the context of the data by using any of the following attributes:
* `globalFilters`
* `settings`
* `summaryData`

In [19]:
myreport.globalFilters

[{'type': 'dateRange',
  'dateRange': '2024-11-01T00:00:00.000/2024-11-30T23:59:59.999'}]

In [20]:
myreport.settings

{'countRepeatInstances': False,
 'limit': 20000,
 'page': 0,
 'nonesBehavior': 'exclude-nones',
 'sampling': None,
 'samplingUpSample': None}

In [21]:
myreport.summaryData

{'filteredTotals': [1582.0, 1001.0],
 'totals': [1612.0, 1023.0],
 'col-max': [468.0, 327.0],
 'col-min': [1.0, 1.0]}
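
Each list in `summaryData` holds one value per metric. The sketch below assumes the values follow the order of the metrics in the request, which is consistent with `col-max` matching the US row of the dataframe (468 events, 327 sessions). `label_summary` is a hypothetical helper that attaches metric IDs to those values.

```python
metric_ids = ["metrics/occurrences", "metrics/visits"]

# Copy of the summaryData output shown above.
summary = {
    "filteredTotals": [1582.0, 1001.0],
    "totals": [1612.0, 1023.0],
    "col-max": [468.0, 327.0],
    "col-min": [1.0, 1.0],
}


def label_summary(summary, metric_ids):
    """Re-key the summary so each metric id maps to its own statistics.

    Assumes every list in `summary` follows the order of `metric_ids`.
    """
    return {
        metric_id: {stat: values[position] for stat, values in summary.items()}
        for position, metric_id in enumerate(metric_ids)
    }


labelled = label_summary(summary, metric_ids)

# Share of the filtered total that comes from the largest row.
events = labelled["metrics/occurrences"]
top_share = events["col-max"] / events["filteredTotals"]
print(f"Top row share of events: {top_share:.1%}")
```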

# Complex data extract