# Toggl Reports Downloader

Script that pulls data from the Toggl API and exports the **latest and complete timelogs** to CSV, along with separate exports of the client, project, and workspace lists.

Useful for backups or further data analysis.

----

### Add Dependencies

In [1]:
import pandas as pd
from datetime import datetime
from dateutil.parser import parse
import time
import pytz

In [2]:
# Toggl Wrapper API 
# https://github.com/matthewdowney/TogglPy
import TogglPy

----

## Authentication

In [88]:
import json

with open("credentials.json", "r") as file:
    credentials = json.load(file)
    toggl_cr = credentials['toggl']
    APIKEY = toggl_cr['APIKEY']

In [89]:
toggl = TogglPy.Toggl()
toggl.setAPIKey(APIKEY) 
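
The cell above expects a `credentials.json` file with the API key nested under `toggl` and `APIKEY`. As a minimal sketch of that layout, the snippet below writes a throwaway file with a placeholder key and reads it back. The helper name `load_toggl_api_key` and the placeholder value are assumptions made for illustration; they are not part of TogglPy.

```python
import json
import os
import tempfile

def load_toggl_api_key(path):
    """Return the value stored under credentials['toggl']['APIKEY']."""
    with open(path, "r") as fh:
        credentials = json.load(fh)
    try:
        return credentials["toggl"]["APIKEY"]
    except KeyError as exc:
        # Point at the missing key instead of failing later during a request.
        raise KeyError("credentials file is missing the key %s" % exc) from exc

# Demonstration with a temporary file and a placeholder key (not a real token).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "credentials.json")
    with open(demo_path, "w") as fh:
        json.dump({"toggl": {"APIKEY": "your-api-token-here"}}, fh)
    demo_key = load_toggl_api_key(demo_path)

print(demo_key)
```

Keeping the key in a separate file makes it easier to exclude it from version control.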

-----

## User Data

In [90]:
user = toggl.request("https://www.toggl.com/api/v8/me")

In [91]:
user_id = user['data']['id']

In [92]:
user['data']['fullname']

'Whynotlogic'

In [93]:
join_date = parse(user['data']['created_at'])
join_date = join_date.replace(tzinfo=None)
join_date

datetime.datetime(2016, 10, 23, 11, 0, 17)

In [94]:
today = datetime.now()
dates = list(pd.date_range(join_date, today))
print("Days Since Joining: " + str(len(dates))) # days since joining

Days Since Joining: 968


-----

## Clients

In [95]:
user_clients = toggl.request("https://www.toggl.com/api/v8/clients")

In [96]:
# the response is already a list of client records, so one conversion is enough
# (looping over it and concatenating would duplicate every row)
clients = pd.DataFrame.from_dict(user_clients)

In [97]:
clients.to_csv('data/toggl-clients.csv')

-----

## Workspaces

API Ref: https://github.com/toggl/toggl_api_docs/blob/master/chapters/workspaces.md#get-workspaces

In [98]:
workspaces_list = toggl.request("https://www.toggl.com/api/v8/workspaces")

In [99]:
len(workspaces_list)

2

In [100]:
workspaces = pd.DataFrame.from_dict(workspaces_list)

In [101]:
workspaces_dict = dict(zip(workspaces.id, workspaces.name))

In [102]:
workspaces.to_csv('data/toggl-workspaces.csv')

----

## Workspace Projects

* API Ref: https://github.com/toggl/toggl_api_docs/blob/master/chapters/workspaces.md#get-workspace-projects
* Endpoint: https://www.toggl.com/api/v8/workspaces/{workspace_id}/projects

In [103]:
projects = pd.DataFrame()
for workspace in workspaces_list:
    projects_list = toggl.request("https://www.toggl.com/api/v8/workspaces/" + str(workspace['id']) + "/projects")
    projects_df_temp = pd.DataFrame.from_dict(projects_list)
    projects = pd.concat([projects_df_temp, projects])

In [104]:
len(projects)

20

In [105]:
# map workspace name onto projects
projects['workspace_name'] = projects.wid.map(workspaces_dict)
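
To illustrate the mapping step, here is a small sketch with made-up data. The frames `workspaces_demo` and `projects_demo` and their values are hypothetical; the technique is the same `dict(zip(...))` lookup combined with `Series.map` used above.

```python
import pandas as pd

# Hypothetical workspaces and projects, standing in for the API responses.
workspaces_demo = pd.DataFrame({"id": [1, 2], "name": ["Personal", "Team"]})
projects_demo = pd.DataFrame({"name": ["A", "B", "C"], "wid": [2, 1, 3]})

# Build an id -> name lookup and map it onto the workspace id column.
lookup = dict(zip(workspaces_demo["id"], workspaces_demo["name"]))
projects_demo["workspace_name"] = projects_demo["wid"].map(lookup)

print(projects_demo)
```

A workspace id that has no entry in the lookup (id 3 here) ends up as `NaN`, which makes orphaned projects easy to spot.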

In [106]:
projects.head(3)

Unnamed: 0,active,actual_hours,at,auto_estimates,billable,cid,color,created_at,hex_color,id,is_private,name,template,wid,workspace_name
0,True,10.0,2019-03-21T07:39:44+00:00,False,False,,9,2019-03-21T07:39:44+00:00,#a01aa5,150378408,True,InternalTrelloTogglTest,False,3316671,Smart Process Lab
1,True,,2019-03-20T16:48:42+00:00,False,False,,7,2019-03-20T16:37:17+00:00,#e19a86,150366206,True,SPL Space,False,3316671,Smart Process Lab
2,True,,2019-03-20T16:37:29+00:00,False,False,44010598.0,4,2019-03-20T16:37:29+00:00,#c7741c,150366215,True,USP,False,3316671,Smart Process Lab


In [107]:
# total time of active projects
projects.actual_hours.sum()

683.0

In [108]:
projects.to_csv('data/toggl-current-projects.csv')

----

# Collect Yearly Export of Detailed Timelogs

In [109]:
def get_detailed_reports(wid, since, until):  # max 365 days
    """Download the detailed report CSV for one workspace and date range."""
    wid = str(wid)  # accept int or str workspace ids
    param = {
        'workspace_id': wid,
        'since': since,
        'until': until,
        'uid': user_id
    }
    # print(wid + " " + since)
    toggl.getDetailedReportCSV(param, "data/detailed/toggl-detailed-report-" + wid + "-" + since + "-" + until + ".csv")

In [110]:
# years since joining
last_year = today.year + 1
years = list(range(join_date.year, last_year))
years

[2016, 2017, 2018, 2019]

In [111]:
# list of workspace ids
workspace_ids = [workspace['id'] for workspace in workspaces_list]

In [112]:
workspace_ids

[1721871, 3316671]

In [113]:
# Generate Detail CSV Tester
workspace_id = "3316671"
since = "2019-01-01"
until = "2019-12-31"

get_detailed_reports(workspace_id, since, until)

In [114]:
# generate a yearly report for each workspace
for i in workspace_ids:
    wid = str(i)
    for y in years:
        since = str(y) + "-01-01" # "2013-01-01"
        until = str(y) + "-12-31" # "2013-12-31"
        try:
            print("Generating CSV... " + "for Workspace: " + wid + " from " + since + " until " + until)
            get_detailed_reports(wid, since, until)
        except Exception:
            # user_id is the global set earlier; uid exists only inside get_detailed_reports
            print("ERROR On:  " + str(user_id) + " " + wid + " from " + since + " until " + until)

Generating CSV... for Workspace: 1721871 from 2016-01-01 until 2016-12-31
Generating CSV... for Workspace: 1721871 from 2017-01-01 until 2017-12-31
Generating CSV... for Workspace: 1721871 from 2018-01-01 until 2018-12-31
Generating CSV... for Workspace: 1721871 from 2019-01-01 until 2019-12-31
Generating CSV... for Workspace: 3316671 from 2016-01-01 until 2016-12-31
Generating CSV... for Workspace: 3316671 from 2017-01-01 until 2017-12-31
Generating CSV... for Workspace: 3316671 from 2018-01-01 until 2018-12-31
Generating CSV... for Workspace: 3316671 from 2019-01-01 until 2019-12-31
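
The loop above requests full calendar years, including months before the join date and after today. A possible refinement, sketched here under the assumption that clipping is wanted, is to trim the first and last window. The helper `yearly_windows` is hypothetical and the two dates are only illustrative.

```python
from datetime import date

def yearly_windows(start, end):
    """Split [start, end] into (since, until) string pairs, one per calendar year."""
    windows = []
    for year in range(start.year, end.year + 1):
        since = max(start, date(year, 1, 1))   # clip the first window to the start date
        until = min(end, date(year, 12, 31))   # clip the last window to the end date
        windows.append((since.isoformat(), until.isoformat()))
    return windows

# Illustrative dates only.
demo_windows = yearly_windows(date(2016, 10, 23), date(2019, 6, 17))
for since, until in demo_windows:
    print(since, until)
```

Each pair stays within a single calendar year and could be passed to `get_detailed_reports` as `since` and `until`.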


-----

## Log of Latest Time Entries for the User

* API Ref: https://github.com/toggl/toggl_api_docs/blob/master/chapters/time_entries.md#get-time-entries-started-in-a-specific-time-range
* Endpoint: https://www.toggl.com/api/v8/time_entries 
* Note: start_date and end_date must be ISO 8601 date and time strings.
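
As a rough sketch of what such strings look like, the snippet below turns plain `YYYY-MM-DD` dates into ISO 8601 timestamps at midnight UTC using only the standard library. The helper `iso_utc` and the sample dates are assumptions for illustration; choosing UTC is also an assumption, and a local offset may suit your account better.

```python
from datetime import datetime, timezone

def iso_utc(day):
    """Format a 'YYYY-MM-DD' string as an ISO 8601 timestamp at midnight UTC."""
    moment = datetime.strptime(day, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return moment.isoformat()

# Parameter names match the start_date / end_date mentioned in the note above.
params_demo = {
    "start_date": iso_utc("2019-06-11"),
    "end_date": iso_utc("2019-06-18"),
}
print(params_demo)
```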

In [115]:
# time entries from the last 9 days (no start_date / end_date is passed here)
latest_time_entries = toggl.request("https://www.toggl.com/api/v8/time_entries")

In [116]:
len(latest_time_entries)

11

In [117]:
latest_time_entries[-1]

{'id': 1219471139,
 'guid': 'e861abfdf16c7158b0150710c0c5ac0e',
 'wid': 1721871,
 'pid': 150954342,
 'billable': False,
 'start': '2019-06-17T10:30:00+00:00',
 'stop': '2019-06-17T14:00:00+00:00',
 'duration': 12600,
 'description': 'Write Project Demand SLS',
 'duronly': False,
 'at': '2019-06-17T11:16:27+00:00',
 'uid': 2523441}

In [118]:
latest_timelog = pd.DataFrame.from_dict(latest_time_entries)

In [119]:
latest_timelog.tail()

Unnamed: 0,at,billable,description,duration,duronly,guid,id,pid,start,stop,uid,wid
6,2019-06-13T16:05:15+00:00,False,Seance Personel,7200,False,e879d7959fbd4f1b9da95f7b572bf8fa,1217105514,150034131,2019-06-13T14:00:00+00:00,2019-06-13T16:00:00+00:00,2523441,1721871
7,2019-06-17T09:06:21+00:00,False,USP Data Analytics and Meeting,18000,False,fe282961a01724ac18acda51d0274a5c,1219331435,150366067,2019-06-14T05:00:00+00:00,2019-06-14T10:00:00+00:00,2523441,1721871
8,2019-06-17T09:06:54+00:00,False,Preparation ELN Labo Trimlibs fixes,10800,False,a95992a6d7ba52d030b9f1e4ec035d21,1219332037,150388923,2019-06-14T11:00:00+00:00,2019-06-14T14:00:00+00:00,2523441,1721871
9,2019-06-17T11:15:32+00:00,False,Write Project Demand SLS,18000,False,c059a63412a057f9dfd6d9f903b2a712,1219470044,150954342,2019-06-17T05:00:00+00:00,2019-06-17T10:00:00+00:00,2523441,1721871
10,2019-06-17T11:16:27+00:00,False,Write Project Demand SLS,12600,False,e861abfdf16c7158b0150710c0c5ac0e,1219471139,150954342,2019-06-17T10:30:00+00:00,2019-06-17T14:00:00+00:00,2523441,1721871


In [120]:
latest_timelog.head()

Unnamed: 0,at,billable,description,duration,duronly,guid,id,pid,start,stop,uid,wid
0,2019-06-13T16:01:27+00:00,False,Compensation Day,28800,False,827005b71c250be9b8b6a5e45f331064,1217101074,151295149,2019-06-11T06:00:00+00:00,2019-06-11T14:00:00+00:00,2523441,1721871
1,2019-06-13T03:50:18+00:00,False,Preparation ELN Labo,18000,False,7de46356acbb340ab759ce49c1f8be08,1216338822,150388923,2019-06-12T05:00:00+00:00,2019-06-12T10:00:00+00:00,2523441,1721871
2,2019-06-13T03:51:32+00:00,False,Preparation ELN Labo,18000,False,a3a1650e490f06da83628f074089a92b,1216339202,150388923,2019-06-12T10:00:00+00:00,2019-06-12T15:00:00+00:00,2523441,1721871
3,2019-06-13T03:52:07+00:00,False,Preparation ELN Labo,10800,False,7a971ea849e081de3731138ca599f636,1216339388,150388923,2019-06-12T18:00:00+00:00,2019-06-12T21:00:00+00:00,2523441,1721871
4,2019-06-13T16:00:37+00:00,False,Preparation ELN Labo,21600,False,10fc289bafbb251d7fd0f90b57316ea2,1217099893,150388923,2019-06-13T10:00:00+00:00,2019-06-13T16:00:00+00:00,2523441,1721871


In [121]:
latest_timelog.to_csv('data/toggl-timelog-latest.csv')

-----

# BONUS: Extract Time Entries for Every Single Day Using Toggl API

**NOTE:** This is a somewhat hackish solution, but it is one possible way to get logs for individual days.

In [122]:
extract_date_start = join_date.strftime("%Y-%m-%d") # join date
extract_date_end = today.strftime("%Y-%m-%d") # today

# Override the full extract (comment out the next line to extract everything since the join date)
extract_date_start = "2018-05-23"
# extract_date_end = "2018-05-01"  # already a string, so no strftime() call is needed
# extract_date_end = today.strftime("%Y-%m-%d") # today

# Function that turns datetimes back to strings since that's what the API likes
def date_only(datetimeVal):
      datePart = datetimeVal.strftime("%Y-%m-%d")
      return datePart

# List of dates for which to extract time entries
dates_range = list(pd.date_range(extract_date_start, extract_date_end))
dates_list = [date_only(x) for x in dates_range]

In [123]:
# Extract Timelogs Between Two Dates and Export to a CSV
def toggl_timelog_extractor(input_date1, input_date2):
    date1 = parse(input_date1).isoformat() + '+00:00'
    date2 = parse(input_date2).isoformat() + '+00:00'
    param = {
        'start_date': date1,
        'end_date': date2,
    }
    url = "https://www.toggl.com/api/v8/time_entries"
    try:
        entries = toggl.request(url, parameters=param)
    except Exception:
        # try again if there is an issue the first time
        entries = toggl.request(url, parameters=param)
    temp_log = pd.DataFrame.from_dict(entries)
    # keep daily files apart from the yearly reports in data/detailed/,
    # which the analysis below reads in bulk
    temp_log.to_csv('data/daily-detailed/toggl-time-entries-' + input_date1 + '.csv')

In [124]:
# UNCOMMENT to Test Between Two Dates
# date1 = '2013-07-23'
# date2 = '2013-07-24'
# toggl_timelog_extractor(date1, date2)

In [125]:
# UNCOMMENT TO RUN
# Extract All Time Entry Data from Previous Days
#for count, item in enumerate(dates_list):
#    if item != dates_list[-1]:
#        date1 = item
#        date2 = (dates_list[count + 1])
#        # print(item + " ~ "+ date2)
#        time.sleep(1)
#        toggl_timelog_extractor(date1, date2)
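
The pairing of each date with the following one can also be written with `zip`, which avoids the index arithmetic and the check for the last item. This is only a sketch with a hypothetical helper `day_pairs` and illustrative dates; it builds the pairs but makes no API calls.

```python
from datetime import date, timedelta

def day_pairs(start, end):
    """Return consecutive (day, next_day) string pairs covering start..end."""
    days = [start + timedelta(days=n) for n in range((end - start).days + 1)]
    # zip stops at the shorter sequence, so the last day has no dangling pair.
    return [(a.isoformat(), b.isoformat()) for a, b in zip(days, days[1:])]

demo_pairs = day_pairs(date(2018, 5, 23), date(2018, 5, 26))
for date1, date2 in demo_pairs:
    print(date1, "~", date2)
```

Each pair could then be handed to `toggl_timelog_extractor(date1, date2)`, keeping the `time.sleep(1)` pause between requests.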

-----

# Simple Data Analysis  (Using Exported CSV Logs)

In [126]:
import glob
import os

In [127]:
# import all exported CSV files and combine them into one data frame
path = 'data/detailed/'
allFiles = glob.glob(os.path.join(path, "*.csv"))
list_ = []
for file_ in allFiles:
    df = pd.read_csv(file_, index_col=None, header=0)
    list_.append(df)
timelog = pd.concat(list_)
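
When the CSV files do not share exactly the same columns, `pd.concat` has to align them. The sketch below, which uses two made-up files in a temporary folder, shows one way to make that explicit: `sort=False` keeps columns in order of first appearance and `ignore_index=True` gives the combined frame a fresh, unique index. The file names and contents are hypothetical.

```python
import glob
import os
import tempfile

import pandas as pd

with tempfile.TemporaryDirectory() as tmp:
    # Two small files with partly different columns, standing in for real exports.
    pd.DataFrame({"Duration": ["01:00:00"], "Project": ["A"]}).to_csv(
        os.path.join(tmp, "a.csv"), index=False)
    pd.DataFrame({"Duration": ["02:00:00"], "Tags": ["x"]}).to_csv(
        os.path.join(tmp, "b.csv"), index=False)

    files = sorted(glob.glob(os.path.join(tmp, "*.csv")))
    frames = [pd.read_csv(f) for f in files]
    combined = pd.concat(frames, ignore_index=True, sort=False)

print(combined)
```

Cells that a file does not provide are filled with `NaN` in the combined frame.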



In [128]:
timelog.head()

Unnamed: 0,Amount (),Billable,Client,Description,Duration,Email,End date,End time,Project,Start date,Start time,Tags,Task,User
0,,No,HESSO,Write Project Demand SLS,03:30:00,whynotlogic@gmail.com,2019-06-17,14:00:00,Aquisition de projects,2019-06-17,10:30:00,,,Whynotlogic
1,,No,HESSO,Write Project Demand SLS,05:00:00,whynotlogic@gmail.com,2019-06-17,10:00:00,Aquisition de projects,2019-06-17,05:00:00,,,Whynotlogic
2,,No,HESSO,Preparation ELN Labo Trimlibs fixes,03:00:00,whynotlogic@gmail.com,2019-06-14,14:00:00,Cours dispensé,2019-06-14,11:00:00,,,Whynotlogic
3,,No,Constellium,USP Data Analytics and Meeting,05:00:00,whynotlogic@gmail.com,2019-06-14,10:00:00,AT - USP,2019-06-14,05:00:00,,,Whynotlogic
4,,No,HESSO,Seance Personel,02:00:00,whynotlogic@gmail.com,2019-06-13,16:00:00,Admin,2019-06-13,14:00:00,,,Whynotlogic


In [129]:
len(timelog)

193

In [130]:
# drop unused columns
timelog = timelog.drop(['Email', 'User', 'Amount ()', 'Client', 'Billable'], axis=1)

In [131]:
# helper functions to convert duration string to seconds
def get_sec(time_str):
    h, m, s = time_str.split(':')
    return int(h) * 3600 + int(m) * 60 + int(s)

# get_sec("01:16:36")

def dur2sec(row):
    return get_sec(row['Duration'])

# timelog.apply(dur2sec, axis=1)
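
As an alternative sketch, pandas can parse `HH:MM:SS` strings directly with `pd.to_timedelta`, which converts a whole column at once instead of row by row. The sample values below are taken from durations shown in this notebook; the variable names are illustrative.

```python
import pandas as pd

durations = pd.Series(["03:30:00", "05:00:00", "00:00:02"])

# Parse the strings as timedeltas, then read them back as whole seconds.
seconds = pd.to_timedelta(durations).dt.total_seconds().astype(int)

print(seconds.tolist())  # [12600, 18000, 2]
```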

In [132]:
timelog['seconds'] = timelog.apply(dur2sec, axis=1)

In [133]:
timelog.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 193 entries, 0 to 17
Data columns (total 10 columns):
Description    191 non-null object
Duration       193 non-null object
End date       193 non-null object
End time       193 non-null object
Project        190 non-null object
Start date     193 non-null object
Start time     193 non-null object
Tags           74 non-null object
Task           0 non-null object
seconds        193 non-null int64
dtypes: int64(1), object(9)
memory usage: 16.6+ KB


In [134]:
timelog.describe()

Unnamed: 0,seconds
count,193.0
mean,12919.715026
std,7787.048987
min,2.0
25%,7200.0
50%,12600.0
75%,18000.0
max,29700.0


In [135]:
timelog.head()

Unnamed: 0,Description,Duration,End date,End time,Project,Start date,Start time,Tags,Task,seconds
0,Write Project Demand SLS,03:30:00,2019-06-17,14:00:00,Aquisition de projects,2019-06-17,10:30:00,,,12600
1,Write Project Demand SLS,05:00:00,2019-06-17,10:00:00,Aquisition de projects,2019-06-17,05:00:00,,,18000
2,Preparation ELN Labo Trimlibs fixes,03:00:00,2019-06-14,14:00:00,Cours dispensé,2019-06-14,11:00:00,,,10800
3,USP Data Analytics and Meeting,05:00:00,2019-06-14,10:00:00,AT - USP,2019-06-14,05:00:00,,,18000
4,Seance Personel,02:00:00,2019-06-13,16:00:00,Admin,2019-06-13,14:00:00,,,7200


In [136]:
timelog.tail()

Unnamed: 0,Description,Duration,End date,End time,Project,Start date,Start time,Tags,Task,seconds
13,USP: Initial Meeting with Idiap,00:00:02,2019-03-21,09:07:44,InternalTrelloTogglTest,2019-03-21,09:07:42,,,2
14,"Type up and share meeting notes for ""A&T Suppl...",01:23:00,2019-03-21,09:07:02,InternalTrelloTogglTest,2019-03-21,07:44:02,,,4980
15,"Type up and share meeting notes for ""A&T Suppl...",00:00:08,2019-03-21,07:41:30,,2019-03-21,07:41:22,,,8
16,"Type up and share meeting notes for ""A&T Suppl...",00:00:14,2019-03-21,07:38:35,,2019-03-21,07:38:21,,,14
17,Meeting: A&T Supply Chain Brainstorm,02:20:00,2019-03-20,15:40:00,InternalTrelloTogglTest,2019-03-20,13:20:00,,,8400


In [137]:
# Total hours
round((timelog.seconds.sum() / 60 / 60), 1)

692.6

In [138]:
# total days
round((timelog.seconds.sum() / 60 / 60 / 24), 1)

28.9

In [139]:
timelog.to_csv("data/toggl-detailed-logs-full-export.csv")

-----

## Combine to a Daily Project Time Number

In [140]:
# combine to daily number
daily_project_time = timelog.groupby(['Start date'])['seconds'].sum()
print('{:,} total project time data'.format(len(daily_project_time)))
daily_project_time.to_csv('data/daily_project_time.csv')
daily_project_time.tail(5)

81 total project time data



Start date
2019-06-11    28800
2019-06-12    46800
2019-06-13    39600
2019-06-14    28800
2019-06-17    30600
Name: seconds, dtype: int64
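
Seconds per day are hard to read at a glance, so a natural follow-up is to express the daily totals in hours. The sketch below repeats the same `groupby` on a small made-up frame and divides by 3600; `log_demo` and its values are hypothetical.

```python
import pandas as pd

# Made-up entries: two on one day, one on the next.
log_demo = pd.DataFrame({
    "Start date": ["2019-06-12", "2019-06-12", "2019-06-13"],
    "seconds": [18000, 10800, 7200],
})

# Sum seconds per day, then convert to hours.
daily_hours = (log_demo.groupby("Start date")["seconds"].sum() / 3600).round(2)

print(daily_hours)
```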