# Todoist Completed Tasks Downloader

This project will collect and aggregate all of your completed task data from Todoist. 

-------

## Installation and Setup

#### Download and Install Todoist Python Library

`$ pip install python-todoist`

#### Signup and Create a Todoist App

* Signup at https://developer.todoist.com/appconsole.html
* Once app is created, generate and copy a "Test token," which provides access to API as your user.
* Copy sample-credentials.json and create credentials.json
* Add and Save your Test Token to credentials.json


-----

## Dependencies

In [1]:
from todoist.api import TodoistAPI
import pandas as pd

import matplotlib.pyplot as plt
from datetime import datetime
%matplotlib inline

### Credentials and Authentification

In [2]:
import json

with open("credentials.json", "r") as file:
    credentials = json.load(file)
    todoist_cr = credentials['todoist']
    TOKEN = todoist_cr['TOKEN']

In [65]:
api = TodoistAPI(TOKEN)
# api.sync()

----------

## Check Basic User Info

In [4]:
user = api.state['user']

In [5]:
user['full_name']

'Mark Koester'

In [6]:
# Tasks Completed Today
user['completed_today']

1

In [7]:
# total completed tasks
user_completed_count = user['completed_count']
user_completed_count

3869

-------

# List and Export of Current Projects

### API Call: api.state['projects']

https://developer.todoist.com/sync/v7/#get-all-projects

NOTE: This only gets info on your existing projects and exludes archived projects. 

In [8]:
user_projects  = api.state['projects']

In [66]:
# user_projects

In [10]:
len(user_projects)

41

In [11]:
with open('data/todoist-projects.csv', 'w') as file:
    file.write("Id" + "," + "Project" + "\n")
    for i in range(0, len(user_projects)):
        file.write('\"' + str(user_projects[i]['id']) + '\"' + "," + '\"' + str(user_projects[i]['name']) + '\"' + "\n")

In [12]:
projects = pd.read_csv("data/todoist-projects.csv")

In [13]:
# projects

-----

## User Completed Tasks Stats Info

API Call: `api.completed.get_stats()` https://developer.todoist.com/sync/v7/#get-productivity-stats

In [14]:
stats = api.completed.get_stats()

In [15]:
# total completed tasks from stats
user_completed_stats = stats['completed_count']
user_completed_stats

3869

-------

# Collect Raw List of All Completed Items from Todoist

### API Call: api.completed.get_all() 

https://developer.todoist.com/sync/v7/#get-all-completed-items

In [16]:
def get_completed_todoist_items():
    # create df from initial 50 completed tasks
    print("Collecting Initial 50 Completed Todoist Tasks...")
    temp_tasks_dict = (api.completed.get_all(limit=50))
    past_tasks = pd.DataFrame.from_dict(temp_tasks_dict['items'])
    # get the remaining items
    pager = list(range(50,user_completed_count,50))
    for count, item in enumerate(pager):
        tmp_tasks = (api.completed.get_all(limit=50, offset=item))
        tmp_tasks_df = pd.DataFrame.from_dict(tmp_tasks['items'])
        past_tasks = pd.concat([past_tasks, tmp_tasks_df])
        print("Collecting Additional Todoist Tasks " + str(item) + " of " + str(user_completed_count))
    # save to CSV
    print("...Generating CSV Export")
    past_tasks.to_csv("data/todost-raw-tasks-completed.csv", index=False)

In [17]:
get_completed_todoist_items()

Collecting Initial 50 Completed Todoist Tasks...
Collecting Additional Todoist Tasks 50 of 3869
Collecting Additional Todoist Tasks 100 of 3869
Collecting Additional Todoist Tasks 150 of 3869
Collecting Additional Todoist Tasks 200 of 3869
Collecting Additional Todoist Tasks 250 of 3869
Collecting Additional Todoist Tasks 300 of 3869
Collecting Additional Todoist Tasks 350 of 3869
Collecting Additional Todoist Tasks 400 of 3869
Collecting Additional Todoist Tasks 450 of 3869
Collecting Additional Todoist Tasks 500 of 3869
Collecting Additional Todoist Tasks 550 of 3869
Collecting Additional Todoist Tasks 600 of 3869
Collecting Additional Todoist Tasks 650 of 3869
Collecting Additional Todoist Tasks 700 of 3869
Collecting Additional Todoist Tasks 750 of 3869
Collecting Additional Todoist Tasks 800 of 3869
Collecting Additional Todoist Tasks 850 of 3869
Collecting Additional Todoist Tasks 900 of 3869
Collecting Additional Todoist Tasks 950 of 3869
Collecting Additional Todoist Tasks 1000

In [18]:
past_tasks = pd.read_csv("data/todost-raw-tasks-completed.csv")

In [19]:
past_tasks.head()

Unnamed: 0,completed_date,content,id,meta_data,project_id,task_id,user_id
0,Thu 24 May 2018 02:07:14 +0000,META: My Studies Check-in,2526803206,,1252539539,2526803206,4288657
1,Wed 23 May 2018 14:29:25 +0000,Shareable Code for Downloading Completed Todoi...,2663068168,,2165379308,2663068168,4288657
2,Wed 23 May 2018 11:26:24 +0000,Initial Code for Exporting Todoist Completed T...,2662789071,,2165379308,2662789071,4288657
3,Wed 23 May 2018 11:26:24 +0000,Edit and Share Last.fm Code for QS Ledger,2662788785,,2165379308,2662788785,4288657
4,Wed 23 May 2018 11:26:21 +0000,Edit and Share Fitbit Code for QS Ledger,2662788601,,2165379308,2662788601,4288657


In [20]:
# generated count 
collected_total = len(past_tasks)
collected_total

3869

In [21]:
# Does our collected total tasks match stat of completed count on user
collected_total == user_completed_count

True

In [22]:
past_tasks['project_id'] = past_tasks.project_id.astype('category')

In [23]:
past_tasks.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3869 entries, 0 to 3868
Data columns (total 7 columns):
completed_date    3869 non-null object
content           3869 non-null object
id                3869 non-null int64
meta_data         0 non-null float64
project_id        3869 non-null category
task_id           3869 non-null int64
user_id           3869 non-null int64
dtypes: category(1), float64(1), int64(3), object(2)
memory usage: 188.2+ KB


In [24]:
len(past_tasks.project_id.unique())

59

---------

## Get All Current and Previous Projects

In [25]:
# Extract all project ids used on tasks
project_ids = past_tasks.project_id.unique()
# project_ids

In [26]:
# total all-time projects
len(project_ids)

59

In [27]:
# get project info from Todoist API
def get_todoist_project_name(project_id):
    item = api.projects.get_by_id(project_id)
    if item: 
        try:
            return item['name']
        except:
            return item['project']['name']

In [28]:
# Testing with a Sample Archived Project
get_todoist_project_name(183682060)

'Math'

In [29]:
# Testing with a Sample Current Project
get_todoist_project_name(1252539618)

'Writing'

In [30]:
# Get Info on All User Projects
project_names = []
for i in project_ids:
    project_names.append(get_todoist_project_name(i))

In [31]:
# project_names

-----

## Match Project Id Name on Completed Tasks, Add Day of Week

In [32]:
past_tasks.tail()

Unnamed: 0,completed_date,content,id,meta_data,project_id,task_id,user_id
3864,Sun 28 Aug 2016 12:07:13 +0000,Read Checklist from MASTER THE GAME,53281331,,178797715,53281331,4288657
3865,Sun 28 Aug 2016 10:40:51 +0000,Weekly Review,53277061,,142200795,53277061,4288657
3866,Sun 28 Aug 2016 10:24:01 +0000,Financial Reflection Writing,53275224,,142200795,53275224,4288657
3867,Sun 28 Aug 2016 10:17:51 +0000,Study Todoist Shortcuts,53273289,,142200795,53273289,4288657
3868,Sun 28 Aug 2016 06:39:21 +0000,Review & Setup Tasks on TODOIST,53265021,,142200795,53265021,4288657


In [33]:
# Probably a more effecient way to do this
project_lookup = lambda x: get_todoist_project_name(x)

In [34]:
past_tasks['project_name'] = past_tasks['project_id'].apply(project_lookup)

In [35]:
len(past_tasks.project_name.unique())

41

In [36]:
len(past_tasks)

3869

In [39]:
# Add Day of Week Completed
past_tasks['completed_date'] = pd.to_datetime(past_tasks['completed_date'])
past_tasks['dow'] = past_tasks['completed_date'].dt.weekday
past_tasks['day_of_week'] = past_tasks['completed_date'].dt.weekday_name

In [40]:
past_tasks.tail()

Unnamed: 0,completed_date,content,id,meta_data,project_id,task_id,user_id,project_name,dow,day_of_week
3864,2016-08-28 12:07:13,Read Checklist from MASTER THE GAME,53281331,,178797715,53281331,4288657,Studies: General,6,Sunday
3865,2016-08-28 10:40:51,Weekly Review,53277061,,142200795,53277061,4288657,Personal,6,Sunday
3866,2016-08-28 10:24:01,Financial Reflection Writing,53275224,,142200795,53275224,4288657,Personal,6,Sunday
3867,2016-08-28 10:17:51,Study Todoist Shortcuts,53273289,,142200795,53273289,4288657,Personal,6,Sunday
3868,2016-08-28 06:39:21,Review & Setup Tasks on TODOIST,53265021,,142200795,53265021,4288657,Personal,6,Sunday


In [41]:
# save to CSV
past_tasks.to_csv("data/todost-tasks-completed.csv", index=False)

------

## Get and Export Current Tasks List

In [42]:
available_task_items = api.state['items']

In [43]:
len(available_task_items)

1409

In [44]:
# Check a Sample Item
available_task_items[150]['project_id']

182358694

In [45]:
# Hackish Solution, Needs Improvement
with open('data/current-tasks-raw.csv', 'w') as file:
    file.write("id,content,checked,date_string,project_id,date_added,due_date_utc,date_completed \n")
    for i in list(range(0, len(available_task_items))): 
        if (available_task_items[i]['checked'] == 0):
            id = available_task_items[i]['id']
            content = available_task_items[i]['content']
            checked = available_task_items[i]['checked']
            date_string = available_task_items[i]['date_string']
            date_added = available_task_items[i]['date_added']
            project_id = available_task_items[i]['project_id']
            due_date_utc = available_task_items[i]['due_date_utc']
            date_completed = available_task_items[i]['date_completed']
            # print("id + "," + str(due_date_utc))
            file.write(str(id) 
                       + "," + '\"' + content + '\"' + "," 
                       + str(checked) + "," 
                       # + str(date_string) + "," 
                       + str(date_added)  + "," 
                       + str(project_id)  + "," 
                       + str(due_date_utc)  + "," 
                       + str(date_completed)                   
                       + "\n")

In [46]:
# df of current tasks
currents_task = pd.read_csv('data/current-tasks-raw.csv')

In [47]:
# total of current tasks (i.e. not archived or completed)
len(currents_task)

988

In [56]:
currents_task.tail()

Unnamed: 0,id,content,checked,date_string,project_id,date_added,due_date_utc,date_completed,project_name
983,2658526051,Python Data Analysis and Code for Apple Health,0,Sun 20 May 2018 13:39:23 +0000,2165379308,,,,Data-Driven You
984,2659830780,Complete Exercises 95 to 100,0,Mon 21 May 2018 14:27:57 +0000,181654076,Thu 24 May 2018 09:00:00 +0000,,,Code Studies
985,2661018388,SSL Certificate Review,0,Tue 22 May 2018 09:18:54 +0000,178885806,Thu 24 May 2018 06:30:00 +0000,,,RTBookReviews
986,2661364491,Pandas: SF Salaries Exercise,0,Tue 22 May 2018 13:28:43 +0000,181654076,,,,Code Studies
987,2662234986,Altered Traits,0,Wed 23 May 2018 01:52:52 +0000,2158267786,,,,Health


In [49]:
currents_task.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 988 entries, 0 to 987
Data columns (total 8 columns):
id                 988 non-null int64
content            988 non-null object
checked            988 non-null int64
date_string        988 non-null object
project_id         988 non-null int64
date_added         988 non-null object
due_date_utc       988 non-null object
date_completed     0 non-null float64
dtypes: float64(1), int64(3), object(4)
memory usage: 61.8+ KB


In [50]:
currents_task['project_name'] = currents_task['project_id'].apply(project_lookup)

In [59]:
# Date Added Cleanup
currents_task['date_added'] = currents_task['date_added'].replace(to_replace="None", value='')

In [61]:
# Add Day of Week Added
currents_task['date_added'] = pd.to_datetime(currents_task['date_added'])
currents_task['dow_added'] = currents_task['date_added'].dt.weekday
currents_task['day_of_week_added'] = currents_task['date_added'].dt.weekday_name

In [63]:
# currents_task.tail()

In [64]:
currents_task.to_csv('data/current-tasks.csv', index=False)

---

# TODO: Simple Data Analysis

In [53]:
# UNCOMMMENT TO VIEW: Report by Projects
# past_tasks.groupby(['project_name']).count()
# currents_task.groupby(['project_name']).count()

In [54]:
# ax = tasks.groupby(['project_name']).count().plot(kind='bar')
#plt.suptitle('Tasks Completed of Top Projects', fontsize=16)
#plt.xlabel('Projects', fontsize=12, color='red')