# Toggl Reports Downloader

Script that extracts data from the Toggl API and creates CSV exports of the **latest and complete timelogs**, as well as separate exports of the client, project, and workspace lists.

Useful for backups or for further data analysis.

----

### Add Dependencies

In [2]:
import pandas as pd
from datetime import datetime
from dateutil.parser import parse
import time

In [3]:
# Toggl Wrapper API 
# https://github.com/matthewdowney/TogglPy
import TogglPy

----

## Authentication

In [4]:
import json

with open("credentials.json", "r") as file:
    credentials = json.load(file)
    toggl_cr = credentials['toggl']
    APIKEY = toggl_cr['APIKEY']

In [5]:
toggl = TogglPy.Toggl()
toggl.setAPIKey(APIKEY) 
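
The loading code above implies that `credentials.json` holds a `toggl` object with an `APIKEY` entry. The sketch below shows that layout and a small loader that reports a missing key clearly. The helper name `load_api_key` and the placeholder token are assumptions made for illustration; they are not part of the notebook or of TogglPy.

```python
import json
import os
import tempfile


def load_api_key(path):
    """Read the Toggl API key from a JSON credentials file."""
    with open(path, "r") as handle:
        credentials = json.load(handle)
    try:
        return credentials["toggl"]["APIKEY"]
    except KeyError as missing:
        raise KeyError(f"credentials file is missing key {missing}") from None


# Demonstration with a throwaway file and a placeholder key (not a real token).
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "credentials.json")
    with open(sample_path, "w") as handle:
        json.dump({"toggl": {"APIKEY": "your-api-token-here"}}, handle)
    demo_key = load_api_key(sample_path)

print(demo_key)
```

Keeping the key in a separate file that is excluded from version control avoids committing it together with the notebook.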

-----

## User Data

In [6]:
user = toggl.request("https://www.toggl.com/api/v8/me")

In [7]:
user_id = user['data']['id']

In [8]:
user['data']['fullname']

'Markwkoester'

In [9]:
join_date = parse(user['data']['created_at'])
join_date

datetime.datetime(2013, 2, 12, 13, 6, 33, tzinfo=tzutc())

In [10]:
today = datetime.now()
dates = list(pd.date_range(join_date, today))
print("Days Since Joining: " + str(len(dates))) # days since joining

Days Since Joining: 1929


-----

## Clients

In [11]:
user_clients = toggl.request("https://www.toggl.com/api/v8/clients")

In [12]:
# The response is already a list of client records, so a single DataFrame
# built from it is enough; concatenating it once per client duplicated every row.
clients = pd.DataFrame.from_dict(user_clients)

In [13]:
clients.to_csv('data/toggl-clients.csv')

-----

## Workspaces

API Ref: https://github.com/toggl/toggl_api_docs/blob/master/chapters/workspaces.md#get-workspaces

In [14]:
workspaces_list = toggl.request("https://www.toggl.com/api/v8/workspaces")

In [15]:
len(workspaces_list)

3

In [16]:
workspaces = pd.DataFrame.from_dict(workspaces_list)

In [17]:
workspaces.to_csv('data/toggl-workspaces.csv')

----

## Workspace Projects

* API Ref: https://github.com/toggl/toggl_api_docs/blob/master/chapters/workspaces.md#get-workspace-projects
* Endpoint: https://www.toggl.com/api/v8/workspaces/{workspace_id}/projects

In [18]:
projects = pd.DataFrame()
for i in list(range(0, len(workspaces_list))):
    projects_list = toggl.request("https://www.toggl.com/api/v8/workspaces/" + str(workspaces_list[i]['id']) + "/projects")
    projects_df_temp = pd.DataFrame.from_dict(projects_list)
    projects = pd.concat([projects_df_temp, projects])

In [19]:
len(projects)

48

In [20]:
projects.head(3)

Unnamed: 0,active,actual_hours,at,auto_estimates,billable,color,created_at,guid,hex_color,id,is_private,name,template,wid
0,True,41.0,2018-02-16T10:10:03+00:00,False,False,10,2018-02-16T10:10:03+00:00,,#f1c33f,100370156,True,BioMarker Tracker,False,1234339
1,True,38.0,2018-02-16T10:10:17+00:00,False,False,5,2018-02-16T10:10:10+00:00,,#4bc800,100370160,True,PhotoStats App,False,1234339
2,True,7.0,2018-02-16T10:10:30+00:00,False,False,8,2018-02-16T10:10:30+00:00,,#3750b5,100370166,True,Podcast Tracker,False,1234339


In [21]:
# total time of active projects
projects.actual_hours.sum()

6362.0

In [22]:
projects.to_csv('data/toggl-current-projects.csv')

----

# Collect Yearly Export of Detailed Timelogs

In [23]:
def get_detailed_reports(wid, since, until):  # max 365 days
    uid = user_id
    param = {
        'workspace_id': wid,
        'since': since,
        'until': until,
        'uid': uid
    }
    # str(wid) lets the function accept workspace ids as integers or strings
    toggl.getDetailedReportCSV(param, "data/detailed/toggl-detailed-report-" + str(wid) + "-" + since + "-" + until + ".csv")

In [24]:
# years since joining
last_year = today.year + 1
years = list(range(join_date.year, last_year))
years

[2013, 2014, 2015, 2016, 2017, 2018]

In [25]:
# list of workspace ids
workspace_ids = []
for i in workspaces_list:
    workspace_ids.append(i['id'])
# workspace_ids

In [26]:
workspace_ids

[341257, 373504, 1234339]

In [27]:
# Generate Detail CSV Tester
workspace_id = "373504"
since = "2017-01-01"
until = "2017-12-31"

get_detailed_reports(workspace_id, since, until)

In [28]:
# generate a yearly report for each workspace
for i in workspace_ids:
    wid = str(i)
    for y in years:
        try: 
            since = str(y) + "-01-01" # "2013-01-01"
            until = str(y) + "-12-31" # "2013-12-31"
            print("Generating CSV... " + "for Workspace: " + str(wid) + " from " + since + " until " + until)
            get_detailed_reports(wid, since, until)            
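        except Exception as err:
            # uid only exists inside get_detailed_reports, so report user_id here
            print("ERROR On:  " + str(user_id) + " " + str(wid) + " from " + since + " until " + until + " (" + str(err) + ")")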

Generating CSV... for Workspace: 341257 from 2013-01-01 until 2013-12-31
Generating CSV... for Workspace: 341257 from 2014-01-01 until 2014-12-31
Generating CSV... for Workspace: 341257 from 2015-01-01 until 2015-12-31
Generating CSV... for Workspace: 341257 from 2016-01-01 until 2016-12-31
Generating CSV... for Workspace: 341257 from 2017-01-01 until 2017-12-31
Generating CSV... for Workspace: 341257 from 2018-01-01 until 2018-12-31
Generating CSV... for Workspace: 373504 from 2013-01-01 until 2013-12-31
Generating CSV... for Workspace: 373504 from 2014-01-01 until 2014-12-31
Generating CSV... for Workspace: 373504 from 2015-01-01 until 2015-12-31
Generating CSV... for Workspace: 373504 from 2016-01-01 until 2016-12-31
Generating CSV... for Workspace: 373504 from 2017-01-01 until 2017-12-31
Generating CSV... for Workspace: 373504 from 2018-01-01 until 2018-12-31
Generating CSV... for Workspace: 1234339 from 2013-01-01 until 2013-12-31
Generating CSV... for Workspace: 1234339 from 2014
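
The yearly windows requested above always run from January 1 to December 31, including the join year and the current year. A possible refinement, sketched below, clips the first window to the join date and the last one to today. The helper name `yearly_ranges` is hypothetical, and clipping is an assumption about what is desirable rather than something the notebook does.

```python
from datetime import date


def yearly_ranges(join_day, last_day):
    """Return (since, until) date strings, one pair per calendar year."""
    ranges = []
    for year in range(join_day.year, last_day.year + 1):
        # Clip the first year to the join date and the last year to last_day.
        since = max(date(year, 1, 1), join_day)
        until = min(date(year, 12, 31), last_day)
        ranges.append((since.isoformat(), until.isoformat()))
    return ranges


example_ranges = yearly_ranges(date(2013, 2, 12), date(2018, 5, 25))
for since, until in example_ranges:
    print(since, until)
```

Each pair stays within a single calendar year, so it respects the one-year limit noted in `get_detailed_reports`.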

-----

## Log of the User's Latest Time Entries

* API Ref: https://github.com/toggl/toggl_api_docs/blob/master/chapters/time_entries.md#get-time-entries-started-in-a-specific-time-range
* Endpoint: https://www.toggl.com/api/v8/time_entries 
* Note: start_date and end_date must be ISO 8601 date and time strings.
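
The ISO 8601 requirement in the note above can be met with the standard library alone. This is a minimal sketch that assumes plain `YYYY-MM-DD` input interpreted as midnight UTC; the helper name `to_iso8601_utc` is hypothetical.

```python
from datetime import datetime, timezone


def to_iso8601_utc(day_string):
    """Convert 'YYYY-MM-DD' to an ISO 8601 timestamp at midnight UTC."""
    day = datetime.strptime(day_string, "%Y-%m-%d")
    return day.replace(tzinfo=timezone.utc).isoformat()


params = {
    "start_date": to_iso8601_utc("2018-05-23"),
    "end_date": to_iso8601_utc("2018-05-24"),
}
print(params["start_date"])  # 2018-05-23T00:00:00+00:00
```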

In [29]:
# latest_time_entries from last 9 days
latest_time_entries = toggl.request("https://www.toggl.com/api/v8/time_entries")

In [30]:
len(latest_time_entries)

100

In [31]:
latest_time_entries[1]

{'at': '2018-05-17T03:05:34+00:00',
 'billable': False,
 'description': 'Four Essential Things Everyone Should Track',
 'duration': 2928,
 'duronly': False,
 'guid': '65fca5b468e846d81b7a133fffaddc51',
 'id': 875655060,
 'pid': 2759162,
 'start': '2018-05-17T02:16:20+00:00',
 'stop': '2018-05-17T03:05:08+00:00',
 'uid': 440666,
 'wid': 341257}

In [32]:
latest_timelog = pd.DataFrame.from_dict(latest_time_entries)

In [33]:
latest_timelog.tail()

Unnamed: 0,at,billable,description,duration,duronly,guid,id,pid,start,stop,tags,uid,wid
95,2018-05-24T13:42:04+00:00,False,Processing Unprocessed,134,False,af0b996ea87f62a51b274a8c12798528,881994005,2858673,2018-05-24T13:39:49+00:00,2018-05-24T13:42:03+00:00,,440666,341257
96,2018-05-24T13:54:43+00:00,False,Processing Email,751,False,e5153da7964d8182d74a9422e247c545,881997093,2858673,2018-05-24T13:42:11+00:00,2018-05-24T13:54:42+00:00,,440666,341257
97,2018-05-25T01:47:06+00:00,False,Morning Pages - On Writing,594,False,4671ccc5e86a035c4e15aeece10d0d23,882563279,2759162,2018-05-25T01:37:11+00:00,2018-05-25T01:47:05+00:00,,440666,341257
98,2018-05-25T03:07:52+00:00,False,Toggl Data Analysis,4596,False,09b8abdb800641a924627bec1d3015bf,882567659,25620514,2018-05-25T01:50:28+00:00,2018-05-25T03:07:04+00:00,[Coding Studies],440666,341257
99,2018-05-25T04:04:55+00:00,False,Toggl Data Analysis,-1527221094,False,907b76a1cc60561a3809d6e3d7a2346e,882607005,25620514,2018-05-25T04:04:54+00:00,,[Coding Studies],440666,341257
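
Row 99 has no `stop` value and a large negative `duration`. In the v8 API a running entry reports its duration as the negated start time in epoch seconds, so the elapsed time is the current epoch time plus that value. The sketch below applies this rule; the helper name `effective_duration` is hypothetical.

```python
import time


def effective_duration(duration, now_epoch=None):
    """Return elapsed seconds, resolving the negative value of a running entry."""
    if duration >= 0:
        return duration  # finished entry: already a duration in seconds
    if now_epoch is None:
        now_epoch = int(time.time())
    return now_epoch + duration  # running entry: now minus start epoch


# Row 99 above started at epoch 1527221094 (2018-05-25T04:04:54Z).
one_minute_in = effective_duration(-1527221094, now_epoch=1527221094 + 60)
print(one_minute_in)  # 60
```

Filtering out or resolving running entries before summing durations keeps such negative values from distorting totals.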


In [34]:
latest_timelog.head()

Unnamed: 0,at,billable,description,duration,duronly,guid,id,pid,start,stop,tags,uid,wid
0,2018-05-17T02:13:54+00:00,False,Morning Pages,244,False,5eb114176143d3692533abeef865d9f6,875652863,2759162,2018-05-17T02:09:49+00:00,2018-05-17T02:13:53+00:00,,440666,341257
1,2018-05-17T03:05:34+00:00,False,Four Essential Things Everyone Should Track,2928,False,65fca5b468e846d81b7a133fffaddc51,875655060,2759162,2018-05-17T02:16:20+00:00,2018-05-17T03:05:08+00:00,,440666,341257
2,2018-05-17T04:21:55+00:00,False,Four Essential Things Everyone Should Track,3654,False,17dc5039539085e479a92db746ef4fb1,875675404,2759162,2018-05-17T03:15:56+00:00,2018-05-17T04:16:50+00:00,,440666,341257
3,2018-05-17T04:35:05+00:00,False,How to Track a Life: Book Writing,483,False,11a73edae8aac156510050bf22a09808,875695923,2759162,2018-05-17T04:26:59+00:00,2018-05-17T04:35:02+00:00,[data-driven you],440666,341257
4,2018-05-17T06:50:46+00:00,False,Site Shutdown Tasks,3080,False,8f47e839d633c444f8f1fd0a5eab7996,875734588,12065403,2018-05-17T05:59:24+00:00,2018-05-17T06:50:44+00:00,,440666,373504


In [35]:
latest_timelog.to_csv('data/toggl-timelog-latest.csv')

-----

# BONUS: Extract Time Entries for Every Single Day Using Toggl API

**NOTE:** This is a somewhat hackish solution, but it is one possible approach to getting logs for individual days.

In [36]:
extract_date_start = join_date.strftime("%Y-%m-%d") # join date
extract_date_end = today.strftime("%Y-%m-%d") # today

# Override the full extract (comment out the next line to start from the join date)
extract_date_start = "2018-05-23"
# extract_date_end = "2018-05-01"  # already a string, so no strftime call is needed
# extract_date_end = today.strftime("%Y-%m-%d") # today

# Function that turns datetimes back to strings since that's what the API likes
def date_only(datetimeVal):
    datePart = datetimeVal.strftime("%Y-%m-%d")
    return datePart

# List of Dates for Which to Extract Time Entries
dates_range = list(pd.date_range(extract_date_start, extract_date_end))
dates_list = [date_only(x) for x in dates_range]

In [37]:
# Extract Timelogs Between Two Dates and Export to a CSV
def toggl_timelog_extractor(input_date1, input_date2):
    date1 = parse(input_date1).isoformat() + '+00:00'
    date2 = parse(input_date2).isoformat() + '+00:00'
    param = {
        'start_date': date1,
        'end_date': date2,
    } 
    # Both attempts write to data/daily-detailed/ so the daily files stay apart
    # from the yearly reports in data/detailed/ (the folder must already exist).
    try:
        temp_log =  pd.DataFrame.from_dict(toggl.request("https://www.toggl.com/api/v8/time_entries", parameters=param))
        temp_log.to_csv('data/daily-detailed/toggl-time-entries-' + input_date1 + '.csv')
    except Exception:
        # try again once if there is an issue the first time
        temp_log =  pd.DataFrame.from_dict(toggl.request("https://www.toggl.com/api/v8/time_entries", parameters=param))
        temp_log.to_csv('data/daily-detailed/toggl-time-entries-' + input_date1 + '.csv')
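
The single retry in the `except` branch above can be generalized into a small helper that retries a few times with a growing delay. This is a sketch under assumptions: `call_with_retry` and `flaky_request` are hypothetical names, and the flaky function merely stands in for a `toggl.request` call so the example runs without network access.

```python
import time


def call_with_retry(func, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(); on failure wait and retry, re-raising the last error."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == attempts:
                raise
            # Double the wait after each failed attempt: 1s, 2s, 4s, ...
            sleep(base_delay * 2 ** (attempt - 1))


# Hypothetical stand-in for an API call that fails twice before succeeding.
calls = {"count": 0}


def flaky_request():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return [{"id": 1}]


delays = []  # record the requested delays instead of actually sleeping
result = call_with_retry(flaky_request, sleep=delays.append)
print(result, delays)
```

Passing `sleep` as a parameter keeps the demonstration instant; in real use the default `time.sleep` would apply.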

In [38]:
# UNCOMMENT to Test Between Two Dates
# date1 = '2013-07-23'
# date2 = '2013-07-24'
# toggl_timelog_extractor(date1, date2)

In [39]:
# UNCOMMENT TO RUN
# Extract All Time Entry Data from Previous Days
#for count, item in enumerate(dates_list):
#    if item != dates_list[-1]:
#        date1 = item
#        date2 = (dates_list[count + 1])
#        # print(item + " ~ "+ date2)
#        time.sleep(1)
#        toggl_timelog_extractor(date1, date2)
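
The commented loop above pairs each date with the following one by index. The same pairing can be written with `zip`, which removes the need to special-case the last element. This is a minimal sketch; the helper name `day_pairs` is hypothetical, and it builds the date strings with the standard library instead of `pd.date_range`.

```python
from datetime import date, timedelta


def day_pairs(first_day, last_day):
    """Return consecutive (start, end) date-string pairs covering the range."""
    span = (last_day - first_day).days
    labels = [(first_day + timedelta(days=n)).isoformat() for n in range(span + 1)]
    # zip stops at the shorter sequence, so the last date is only ever an end.
    return list(zip(labels, labels[1:]))


pairs = day_pairs(date(2018, 5, 23), date(2018, 5, 25))
for start, end in pairs:
    print(start, "~", end)
```

Each pair could then be passed to `toggl_timelog_extractor(start, end)`, with a pause between calls as in the loop above.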

-----

# Simple Data Analysis  (Using Exported CSV Logs)

In [40]:
import glob
import os

In [41]:
# import all exported CSV files in the folder and combine them into one data frame
path = 'data/detailed/'
allFiles = glob.glob(os.path.join(path, "*.csv"))
list_ = []
for file_ in allFiles:
    df = pd.read_csv(file_, index_col=None, header=0)
    list_.append(df)
timelog = pd.concat(list_)

In [42]:
timelog.head()

Unnamed: 0,Amount (),Billable,Client,Description,Duration,Email,End date,End time,Project,Start date,Start time,Tags,Task,User
0,,No,,Toggl Data Analysis,01:16:36,markwkoester@gmail.com,2018-05-25,11:07:04,Data Analysis,2018-05-25,09:50:28,Coding Studies,,Markwkoester
1,,No,,Morning Pages - On Writing,00:09:54,markwkoester@gmail.com,2018-05-25,09:47:05,Writing,2018-05-25,09:37:11,,,Markwkoester
2,,No,,Processing Email,00:12:31,markwkoester@gmail.com,2018-05-24,21:54:42,Organizational Work,2018-05-24,21:42:11,,,Markwkoester
3,,No,,Processing Unprocessed,00:02:14,markwkoester@gmail.com,2018-05-24,21:42:03,Organizational Work,2018-05-24,21:39:49,,,Markwkoester
4,,No,,Task Management,00:03:39,markwkoester@gmail.com,2018-05-24,21:39:47,Organizational Work,2018-05-24,21:36:08,,,Markwkoester


In [43]:
len(timelog)

16852

In [44]:
# drop unused columns
timelog = timelog.drop(['Email', 'User', 'Amount ()', 'Client', 'Billable'], axis=1)

In [45]:
# helper functions to convert duration string to seconds
def get_sec(time_str):
    h, m, s = time_str.split(':')
    return int(h) * 3600 + int(m) * 60 + int(s)

# get_sec("01:16:36")

def dur2sec(row):
    return get_sec(row['Duration'])

# timelog.apply(dur2sec, axis=1)

In [46]:
timelog['seconds'] = timelog.apply(dur2sec, axis=1)
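
The row-wise `apply` above works, but the same conversion can be done on the whole column at once. The sketch below splits the `HH:MM:SS` strings with pandas string methods; the helper name `durations_to_seconds` is hypothetical, and it assumes every value has exactly three colon-separated integer parts.

```python
import pandas as pd


def durations_to_seconds(durations):
    """Convert a Series of 'HH:MM:SS' strings to integer seconds."""
    parts = durations.str.split(":", expand=True).astype(int)
    return parts[0] * 3600 + parts[1] * 60 + parts[2]


sample = pd.DataFrame({"Duration": ["01:16:36", "00:09:54", "70:57:00"]})
sample["seconds"] = durations_to_seconds(sample["Duration"])
print(sample)
```

Because the hour part is parsed as a plain integer, durations longer than 24 hours are handled the same way as short ones.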

In [47]:
timelog.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 16852 entries, 0 to 181
Data columns (total 10 columns):
Description    16829 non-null object
Duration       16852 non-null object
End date       16852 non-null object
End time       16852 non-null object
Project        16727 non-null object
Start date     16852 non-null object
Start time     16852 non-null object
Tags           866 non-null object
Task           0 non-null object
seconds        16852 non-null int64
dtypes: int64(1), object(9)
memory usage: 1.4+ MB


In [48]:
timelog.describe()

Unnamed: 0,seconds
count,16852.0
mean,1896.191253
std,2964.958
min,0.0
25%,670.0
50%,1315.0
75%,2409.0
max,255420.0


In [49]:
timelog.head()

Unnamed: 0,Description,Duration,End date,End time,Project,Start date,Start time,Tags,Task,seconds
0,Toggl Data Analysis,01:16:36,2018-05-25,11:07:04,Data Analysis,2018-05-25,09:50:28,Coding Studies,,4596
1,Morning Pages - On Writing,00:09:54,2018-05-25,09:47:05,Writing,2018-05-25,09:37:11,,,594
2,Processing Email,00:12:31,2018-05-24,21:54:42,Organizational Work,2018-05-24,21:42:11,,,751
3,Processing Unprocessed,00:02:14,2018-05-24,21:42:03,Organizational Work,2018-05-24,21:39:49,,,134
4,Task Management,00:03:39,2018-05-24,21:39:47,Organizational Work,2018-05-24,21:36:08,,,219


In [50]:
timelog.tail()

Unnamed: 0,Description,Duration,End date,End time,Project,Start date,Start time,Tags,Task,seconds
177,BioMarkerDB: Planning and Setup,00:01:25,2018-01-06,15:53:32,Startup Project Misc Work,2018-01-06,15:52:07,,,85
178,BioMarkerDB: Brainstorming,00:22:32,2018-01-05,23:32:28,Startup Project Misc Work,2018-01-05,23:09:56,,,1352
179,VO2 Max Estimator App,00:39:19,2018-01-03,15:16:35,Startup Project Misc Work,2018-01-03,14:37:16,,,2359
180,VO2 Max Estimator App,00:31:46,2018-01-03,14:12:53,Startup Project Misc Work,2018-01-03,13:41:07,,,1906
181,Medical Tourism in Thailand: Research,00:41:24,2018-01-03,13:14:30,Startup Project Misc Work,2018-01-03,12:33:06,,,2484


In [51]:
# Total hours
round((timelog.seconds.sum() / 60 / 60), 1)

8876.3

In [52]:
# total days
round((timelog.seconds.sum() / 60 / 60 / 24), 1)

369.8
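
Beyond the overall totals, the same `seconds` column can be summed per project. The sketch below uses a small made-up frame in place of `timelog`; the helper name `hours_by_project` and the `(no project)` label are assumptions. The label is there because `timelog.info()` above shows that some entries have no project.

```python
import pandas as pd


def hours_by_project(log):
    """Return total hours per project, largest first."""
    labeled = log.assign(Project=log["Project"].fillna("(no project)"))
    totals = labeled.groupby("Project")["seconds"].sum()
    return (totals / 3600).round(1).sort_values(ascending=False)


# Made-up sample rows with the same column names as the exported timelog.
sample_log = pd.DataFrame({
    "Project": ["Writing", "Data Analysis", "Writing", None],
    "seconds": [594, 4596, 3006, 134],
})
project_hours = hours_by_project(sample_log)
print(project_hours)
```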

In [53]:
timelog.to_csv("data/toggl-detailed-logs-full-export.csv")