# Activity File Structure

This notebook explores the structure of the JSON file that stores an activity. GoldenCheetah generates this file when an activity is imported (from a TCX file or similar).

The goal is to identify the elements and components of an activity, in order to create the needed classes in the code.

In [1]:
import json
import datetime as dt

In [2]:
#filename = '/home/david/Documentos/goldencheetah/david/activities/2018_11_24_10_27_19.json'
filename = '/home/david/Documentos/goldencheetah/david/activities/2020_08_21_11_26_28.json'

In [3]:
with open(filename, 'r', encoding='utf-8-sig') as f:
    data = json.load(f)

## Structure

- Ride
    - Start Time
    - RecIntSecs
    - DeviceType
    - Identifier
    - Tags
    - Intervals
        - Name
        - Start
        - Stop
        - Color
        - PTest
    - Samples
        - Secs
        - Km
        - Watts
        - Cad
        - Kph
        - HR
        - Alt
        - Lat
        - Lon
        - Slope

## 1. Ride

The JSON object's root is a `RIDE` key. It contains all the data for the activity.

In [4]:
data.keys()

dict_keys(['RIDE'])

The `RIDE` structure contains several subelements:

In [5]:
ride = data['RIDE']
ride.keys()

dict_keys(['STARTTIME', 'RECINTSECS', 'DEVICETYPE', 'IDENTIFIER', 'TAGS', 'INTERVALS', 'SAMPLES'])

### 1.1 - Start Time

The date and time when the activity starts are stored using the key `STARTTIME`.

In [6]:
start = ride['STARTTIME']
start

'2020/08/21 09:26:28 UTC '

`STARTTIME` is a string defining date, time and timezone. The format is `%Y/%m/%d %H:%M:%S UTC `.

> **Note:** the trailing whitespace must be included in the format string; otherwise `strptime` raises an error.

In [7]:
d = dt.datetime.strptime(start, '%Y/%m/%d %H:%M:%S UTC ')

`STARTTIME` is stored as a `str`.

In [8]:
print('Type: {}'.format(type(start).__name__))

Type: str
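
As an alternative to keeping the trailing whitespace in the format string, the value can be stripped before parsing. The sketch below also attaches the UTC timezone that the string announces, since `strptime` with this format returns a naive `datetime`. The helper name `parse_starttime` is hypothetical, and the sketch assumes that `STARTTIME` is always expressed in UTC, as it is in this file.

```python
import datetime as dt


def parse_starttime(raw: str) -> dt.datetime:
    """Parse a STARTTIME string into a timezone-aware UTC datetime.

    Hypothetical helper; assumes the value always ends in 'UTC'.
    """
    # strip() removes the trailing whitespace, so the format needs no trailing space
    naive = dt.datetime.strptime(raw.strip(), '%Y/%m/%d %H:%M:%S UTC')
    # Attach the timezone explicitly: the parsed value carries no tzinfo
    return naive.replace(tzinfo=dt.timezone.utc)


start_dt = parse_starttime('2020/08/21 09:26:28 UTC ')
print(start_dt.isoformat())  # 2020-08-21T09:26:28+00:00
```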


### 1.2 - Recording interval

The recording interval, in seconds, is stored with the key `RECINTSECS`.

In [9]:
recint = ride['RECINTSECS']
recint

1

`RECINTSECS` is stored as an `int`.

In [10]:
print('Type: {}'.format(type(recint).__name__))

Type: int


### 1.3 - Device type

The key `DEVICETYPE` stores the device's name.

In [11]:
device = ride['DEVICETYPE']
device

'Polar V650 '

`DEVICETYPE` is stored as a `str`.

> **Note:** the value has trailing whitespace.

In [12]:
print('Type: {}'.format(type(device).__name__))

Type: str


### 1.4 - Identifier

The `IDENTIFIER` key is unused. In this file it contains a single space.

In [13]:
identifier = ride['IDENTIFIER']
identifier

' '

In [14]:
print('Type: {}'.format(type(identifier).__name__))

Type: str


### 1.5 - Tags

The `TAGS` key stores a dictionary containing multiple metrics and data related to the activity:

- The athlete's profile.
- The history of changes made to the file.
- The route's name.
- The Notes field.
- The name of the JSON file and the name of the original TCX file.
- Any other data entered in the Details tab.

In [16]:
tags = ride['TAGS']
#tags.keys()

All elements in the dictionary are stored as `str`.

> **Note:** every value in the dictionary has trailing whitespace.

In [17]:
# Collect the distinct value types found in TAGS, in order of first appearance
types = []
for v in tags.values():
    t = type(v).__name__
    if t not in types:
        types.append(t)
print('Value types in TAGS:', types)

Value types in TAGS: ['str']


In [19]:
key = 'Distance'
val = tags[key]
print('tags[{}] = "{}" ({})'.format(key, val, type(val).__name__))

tags[Distance] = "0 " (str)
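
Because every value carries trailing whitespace, it may be convenient to clean the whole dictionary once before using it. The following is a minimal sketch; the function name `clean_tags` is hypothetical and the example dictionary is a small hand-written subset of the tags in this file.

```python
def clean_tags(raw_tags: dict) -> dict:
    """Return a copy of a TAGS dictionary with surrounding whitespace removed.

    Hypothetical helper; assumes every value is a str, as observed in this file.
    """
    # strip() only touches the ends, so line breaks inside multi-line
    # values (e.g. Notes) are preserved
    return {key: value.strip() for key, value in raw_tags.items()}


example_tags = {'Athlete': 'david ', 'Distance': '0 ', 'Keywords': ' '}
cleaned = clean_tags(example_tags)
print(cleaned)  # {'Athlete': 'david', 'Distance': '0', 'Keywords': ''}
```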


#### Types of tags in the activity

In [20]:
for key, value in tags.items():
    print('{} --> {}'.format(key, value))

Aerobic TISS --> 0 
Anaerobic TISS --> 0 
Athlete --> david 
Average Cadence --> 0 
Average Heart Rate --> 0 
Average Power --> 0 
Average Speed --> 0 
BikeScore™ --> 0 
BikeStress --> 0 
CP --> 0 
Change History --> Cambios en vie. ago. 21 16:26:33 2020:
Cambios en vie. ago. 21 17:38:39 2020:
Cambios en vie. ago. 21 17:40:29 2020:
Cambios en vie. ago. 21 17:51:40 2020:
Cambios en vie. ago. 21 17:53:56 2020:
 
Daniels EqP --> 0 
Daniels Points --> 0 
Data --> TDSPHC-AGL----- 
Device --> Polar V650 
Device Info -->  
Distance --> 0 
Duration --> 0 
Elevation Gain --> 0 
File Format -->  
Filename --> 2020_08_21_11_26_28.json 
GOVSS --> 0 
Keywords -->  
Month --> agosto 
Notes --> La subida a Aralla bastante cómoda, aunque el tramo desde el desvío al túnel al puerto se hizo un poco más duro por haber parado en el túnel.
La subida al Rabizo se hizo muy dura. Sin fuerzas.
Viento de cara a la vuelta, intensidad media. 
Objective -->  
Pool Length --> 0 
RPE --> 6.5 
Route --> La Magdalena 

Although every tag is stored as a `str`, the values fall into three groups according to the data type they represent: `int`, `float` and `str`.
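
One possible way to recover those types is to try each numeric conversion in turn and fall back to the stripped string. This is only a sketch: `coerce_tag` is a hypothetical name, and inferring the type from the value alone cannot distinguish a float metric that happens to be `0` from an integer one, nor protect a text tag whose content looks numeric (Python's `float` also accepts strings such as `nan` or `inf`). A mapping from tag name to expected type would be more robust.

```python
def coerce_tag(value: str):
    """Convert a tag value to int or float when possible, else return the stripped text.

    Hypothetical helper; the type is guessed from the value itself.
    """
    text = value.strip()
    # Try the narrowest type first: '0' becomes int, '6.5' becomes float
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            continue
    return text


for raw in ('0 ', '6.5 ', 'La Magdalena ', ' '):
    print(repr(coerce_tag(raw)))
# 0
# 6.5
# 'La Magdalena'
# ''
```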

### 1.6 - Intervals

Intervals (laps) are stored as a list of dictionaries, each containing one interval.

In [21]:
intervals = ride['INTERVALS']
intervals[:2]

[{'NAME': 'Int 1 ',
  'START': 0,
  'STOP': 1474,
  'COLOR': '#000000',
  'PTEST': 'false'},
 {'NAME': 'Int 2 ',
  'START': 1513,
  'STOP': 1909,
  'COLOR': '#000000',
  'PTEST': 'false'}]

Intervals include the following keys (elements):

In [22]:
interv = intervals[1]
for k, v in interv.items():
    print('{0}\t{1}\t"{2}"'.format(k, type(v).__name__, v))

NAME	str	"Int 2 "
START	int	"1513"
STOP	int	"1909"
COLOR	str	"#000000"
PTEST	str	"false"
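
Since the goal is to derive classes from this structure, an interval could be modelled roughly as follows. The class name `Interval` and its `from_dict` constructor are hypothetical. The sketch strips the trailing whitespace from `NAME` and converts `PTEST`, which is stored as the string `'false'` rather than a JSON boolean, into a `bool`; it assumes the only other value `PTEST` takes is `'true'`.

```python
from dataclasses import dataclass


@dataclass
class Interval:
    """One entry of the INTERVALS list (hypothetical class)."""
    name: str
    start: int   # observed as int in this file
    stop: int    # observed as int in this file
    color: str
    ptest: bool

    @classmethod
    def from_dict(cls, raw: dict) -> 'Interval':
        return cls(
            name=raw['NAME'].strip(),
            start=raw['START'],
            stop=raw['STOP'],
            color=raw['COLOR'],
            # PTEST is a string; anything other than 'true' is treated as False
            ptest=raw['PTEST'].strip().lower() == 'true',
        )


interval = Interval.from_dict(
    {'NAME': 'Int 2 ', 'START': 1513, 'STOP': 1909,
     'COLOR': '#000000', 'PTEST': 'false'}
)
print(interval)
```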


### 1.7 - Samples

Every point in the ride is stored as a dictionary. All the dictionaries (samples) are stored in a list with the key `SAMPLES`.

In [23]:
samples = ride['SAMPLES']
samples[100:102]

[{'SECS': 108.981,
  'KM': 0.649,
  'WATTS': 129,
  'CAD': 97,
  'KPH': 32.4,
  'HR': 146,
  'ALT': 977.074,
  'LAT': 42.78433333,
  'LON': -5.796425,
  'SLOPE': -1.42857},
 {'SECS': 109.981,
  'KM': 0.655,
  'WATTS': 121,
  'CAD': 96,
  'KPH': 21.6,
  'HR': 146,
  'ALT': 977.074,
  'LAT': 42.78433833,
  'LON': -5.79659833,
  'SLOPE': -1.42857}]

Samples include the following keys (elements):

In [24]:
sample = samples[150]
for k, v in sample.items():
    print('{0}\t{1}\t{2}'.format(k, type(v).__name__, v))

SECS	float	158.981
KM	float	1.06
WATTS	int	139
CAD	int	97
KPH	float	32.4
HR	int	146
ALT	float	976.074
LAT	float	42.78419833
LON	float	-5.80137333
SLOPE	int	0
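
Note that `SLOPE` appears as an `int` in this sample (`0`) but as a `float` in the samples shown earlier (`-1.42857`), so the JSON type of a field can vary with its value. A class for samples might therefore normalise the numeric types on construction. The sketch below uses a hypothetical `Sample` class and assumes that every sample contains all ten keys, which has only been checked for the samples printed above.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One entry of the SAMPLES list (hypothetical class)."""
    secs: float
    km: float
    watts: int
    cad: int
    kph: float
    hr: int
    alt: float
    lat: float
    lon: float
    slope: float

    @classmethod
    def from_dict(cls, raw: dict) -> 'Sample':
        # float() makes fields such as SLOPE consistent even when
        # the JSON stores a whole number like 0
        return cls(
            secs=float(raw['SECS']),
            km=float(raw['KM']),
            watts=raw['WATTS'],
            cad=raw['CAD'],
            kph=float(raw['KPH']),
            hr=raw['HR'],
            alt=float(raw['ALT']),
            lat=float(raw['LAT']),
            lon=float(raw['LON']),
            slope=float(raw['SLOPE']),
        )


point = Sample.from_dict(
    {'SECS': 158.981, 'KM': 1.06, 'WATTS': 139, 'CAD': 97, 'KPH': 32.4,
     'HR': 146, 'ALT': 976.074, 'LAT': 42.78419833, 'LON': -5.80137333,
     'SLOPE': 0}
)
print(type(point.slope).__name__, point.slope)  # float 0.0
```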
