# Todoist Completed Tasks Downloader

This project will collect and aggregate all of your completed task data from Todoist. 

For a simple data analysis of your completed tasks, see [todoist_data_analysis.ipynb](https://github.com/markwk/qs_ledger/blob/master/todoist/todoist_data_analysis.ipynb). 

-------

## Installation and Setup

#### Download and Install Todoist Python Library

`$ pip install todoist-python`

#### Sign Up and Create a Todoist App

* Go to https://todoist.com/prefs/integrations and find your `API Token` at the very bottom of the page
* Copy sample-credentials.json to create credentials.json
* Add your `API Token` to credentials.json and save the file
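
Judging from the loading code used later in this notebook, credentials.json is expected to hold the token under a `todoist` key. The following is a minimal sketch of that layout, not the contents of the actual sample-credentials.json; the token value is a placeholder, and the file is written to a temporary directory so a real credentials.json is not overwritten.

```python
import json
import os
import tempfile

# Layout assumed from the loading code in this notebook:
# {"todoist": {"TOKEN": "..."}}
sample = {"todoist": {"TOKEN": "your-api-token-here"}}  # placeholder value

# Write to a temp directory so a real credentials.json is left untouched.
path = os.path.join(tempfile.mkdtemp(), "credentials.json")
with open(path, "w") as fh:
    json.dump(sample, fh, indent=2)

# Read it back the same way the notebook does.
with open(path) as fh:
    token = json.load(fh)["todoist"]["TOKEN"]

print(token)
```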


-----

## Dependencies

In [1]:
from todoist.api import TodoistAPI
import numpy as np, string, re, pytz
import pandas as pd
from datetime import datetime

### Credentials and Authentication

In [2]:
import json

with open("credentials.json", "r") as file:
    credentials = json.load(file)
    todoist_cr = credentials['todoist']
    TOKEN = todoist_cr['TOKEN']

In [3]:
api = TodoistAPI(TOKEN)
# api.sync()  # uncomment to refresh api.state from the server before reading it

----------

## Check Basic User Info

In [4]:
user = api.state['user']

In [5]:
# user

In [6]:
# user['full_name']

-------

## List and Export Current Projects

### API Call: api.state['projects']

https://developer.todoist.com/sync/v7/#get-all-projects

NOTE: This only gets info on your existing projects and excludes archived projects.

In [7]:
user_projects  = api.state['projects']

In [8]:
# user_projects

In [9]:
len(user_projects)

24

In [10]:
with open('data/todoist-projects.csv', 'w') as file:
    file.write("Id" + "," + "Project" + "\n")
    for i in range(0, len(user_projects)):
        file.write('\"' + str(user_projects[i]['id']) + '\"' + "," + '\"' + str(user_projects[i]['name']) + '\"' + "\n")
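
Building each row by string concatenation can produce a malformed file if a project name contains a double quote. A possible alternative, sketched here with Python's standard `csv` module, lets the writer handle quoting and escaping. The `sample_projects` list and the `write_projects_csv` helper are hypothetical stand-ins, and the sketch writes to an in-memory buffer rather than `data/todoist-projects.csv`.

```python
import csv
import io

# Stand-in for api.state['projects']; the real objects are assumed to
# support ['id'] and ['name'] lookups in the same way.
sample_projects = [
    {"id": 101, "name": "Inbox"},
    {"id": 102, "name": 'Work, "urgent"'},
]

def write_projects_csv(projects, fh):
    """Write an Id/Project table, quoting every field like the original cell."""
    writer = csv.writer(fh, quoting=csv.QUOTE_ALL)
    writer.writerow(["Id", "Project"])
    for project in projects:
        writer.writerow([project["id"], project["name"]])

buffer = io.StringIO()
write_projects_csv(sample_projects, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write to disk instead, the same helper could be called with a file opened as `open(path, "w", newline="")`.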

In [11]:
projects = pd.read_csv("data/todoist-projects.csv")

In [12]:
# projects

-----

## User Completed Tasks Stats Info

API Call: `api.completed.get_stats()` https://developer.todoist.com/sync/v7/#get-productivity-stats

In [13]:
stats = api.completed.get_stats()

In [14]:
# total completed tasks from stats
user_completed_stats = stats['completed_count']
user_completed_stats

5311

-------

## Collect Raw List of All Completed Items from Todoist

### API Call: api.completed.get_all() 

https://developer.todoist.com/sync/v7/#get-all-completed-items

In [15]:
def get_completed_todoist_items():
    # create df from initial 50 completed tasks
    print("Collecting Initial 50 Completed Todoist Tasks...")
    temp_tasks_dict = (api.completed.get_all(limit=50))
    past_tasks = pd.DataFrame.from_dict(temp_tasks_dict['items'])
    # get the remaining items
    pager = list(range(50,user_completed_stats,50))
    for count, item in enumerate(pager):
        tmp_tasks = (api.completed.get_all(limit=50, offset=item))
        tmp_tasks_df = pd.DataFrame.from_dict(tmp_tasks['items'])
        past_tasks = pd.concat([past_tasks, tmp_tasks_df], sort=False)
        print("Collecting Additional Todoist Tasks " + str(item) + " of " + str(user_completed_stats))
    # save to CSV
    print("...Generating CSV Export")
    past_tasks.to_csv("data/todost-raw-tasks-completed.csv", index=False)
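
The function above uses the stats count to decide how many pages to request. Since that count and the number of rows actually collected differ in this notebook (5311 versus 5342), one alternative is to keep paging until a page comes back empty. The sketch below shows that pattern with a fake data source; `fetch_all_pages` and `fake_fetch` are hypothetical names, not part of the Todoist library.

```python
def fetch_all_pages(fetch_page, page_size=50):
    """Call fetch_page(limit, offset) repeatedly until an empty page is returned."""
    items = []
    offset = 0
    while True:
        page = fetch_page(limit=page_size, offset=offset)
        if not page:
            break  # nothing left to collect
        items.extend(page)
        offset += page_size
    return items

# Fake data source standing in for the API (120 items in total).
fake_items = [{"id": i} for i in range(120)]

def fake_fetch(limit, offset):
    return fake_items[offset:offset + limit]

collected = fetch_all_pages(fake_fetch)
print(len(collected))  # 120
```

With the client used in this notebook, the fetch function would presumably wrap `api.completed.get_all(limit=limit, offset=offset)['items']`.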

In [16]:
get_completed_todoist_items()

Collecting Initial 50 Completed Todoist Tasks...
Collecting Additional Todoist Tasks 50 of 5311
Collecting Additional Todoist Tasks 100 of 5311
Collecting Additional Todoist Tasks 150 of 5311
Collecting Additional Todoist Tasks 200 of 5311
Collecting Additional Todoist Tasks 250 of 5311
Collecting Additional Todoist Tasks 300 of 5311
Collecting Additional Todoist Tasks 350 of 5311
Collecting Additional Todoist Tasks 400 of 5311
Collecting Additional Todoist Tasks 450 of 5311
Collecting Additional Todoist Tasks 500 of 5311
Collecting Additional Todoist Tasks 550 of 5311
Collecting Additional Todoist Tasks 600 of 5311
Collecting Additional Todoist Tasks 650 of 5311
Collecting Additional Todoist Tasks 700 of 5311
Collecting Additional Todoist Tasks 750 of 5311
Collecting Additional Todoist Tasks 800 of 5311
Collecting Additional Todoist Tasks 850 of 5311
Collecting Additional Todoist Tasks 900 of 5311
Collecting Additional Todoist Tasks 950 of 5311
Collecting Additional Todoist Tasks 1000

In [17]:
past_tasks = pd.read_csv("data/todost-raw-tasks-completed.csv")

In [18]:
# past_tasks.head()

In [19]:
# generated count 
collected_total = len(past_tasks)
collected_total

5342

In [20]:
len(past_tasks.drop_duplicates())

5342

In [21]:
past_tasks['project_id'] = past_tasks.project_id.astype('category')

In [22]:
past_tasks.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5342 entries, 0 to 5341
Data columns (total 10 columns):
content              5340 non-null object
meta_data            0 non-null float64
user_id              5342 non-null int64
task_id              5342 non-null int64
project_id           5342 non-null category
completed_date       5342 non-null object
id                   5342 non-null int64
legacy_project_id    5032 non-null float64
legacy_task_id       1904 non-null float64
legacy_id            1728 non-null float64
dtypes: category(1), float64(4), int64(3), object(2)
memory usage: 382.6+ KB


In [23]:
len(past_tasks.project_id.unique())

45

---------

## Get All Current and Previous Projects

In [24]:
# Extract all project ids used on tasks
project_ids = past_tasks.project_id.unique()
# project_ids

In [25]:
# total all-time projects
len(project_ids)

45

In [26]:
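# get project info from Todoist API
def get_todoist_project_name(project_id):
    item = api.projects.get_by_id(project_id)
    if item:
        try:
            return item['name']
        except KeyError:
            # some responses nest the project under a 'project' key
            return item['project']['name']
    return None  # no project found for this id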

In [27]:
# Testing with a Sample Archived Project
# get_todoist_project_name(183682060)

In [28]:
# Testing with a Sample Current Project
# get_todoist_project_name(1252539618)

In [29]:
# Get Info on All User Projects
project_names = []
for i in project_ids:
    project_names.append(get_todoist_project_name(i))

In [30]:
# project_names

-----

## Match Project Id Name on Completed Tasks, Add Day of Week

In [31]:
# past_tasks.tail()

In [32]:
# Probably a more efficient way to do this
project_lookup = lambda x: get_todoist_project_name(x)

In [33]:
past_tasks['project_name'] = past_tasks['project_id'].apply(project_lookup) # note: not very efficient
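
Applying the lookup to every row repeats the same call for each task in a project. One way to reduce that, sketched below, is to resolve each unique project id once and then map the resulting dictionary onto the column. `lookup_name` is a hypothetical stand-in for `get_todoist_project_name`, and the small `tasks` frame is made-up sample data.

```python
import pandas as pd

calls = []  # records how often the lookup runs

def lookup_name(project_id):
    # Hypothetical stand-in for get_todoist_project_name; returns None if unknown.
    calls.append(project_id)
    return {1: "Inbox", 2: "Work"}.get(project_id)

tasks = pd.DataFrame({"project_id": [1, 2, 1, 1, 2, 3]})

# Resolve each distinct id once, then map the names onto every row.
name_by_id = {pid: lookup_name(pid) for pid in tasks["project_id"].unique()}
tasks["project_name"] = tasks["project_id"].map(name_by_id)

print(tasks)
print(len(calls))  # one lookup per unique id
```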

In [34]:
len(past_tasks.project_name.unique())

36

In [35]:
# functions to convert UTC to Shanghai time zone and extract date/time elements
convert_tz = lambda x: x.to_pydatetime().replace(tzinfo=pytz.utc).astimezone(pytz.timezone('Asia/Shanghai'))
get_year = lambda x: convert_tz(x).year
get_month = lambda x: '{}-{:02}'.format(convert_tz(x).year, convert_tz(x).month) #inefficient
get_date = lambda x: '{}-{:02}-{:02}'.format(convert_tz(x).year, convert_tz(x).month, convert_tz(x).day) #inefficient
get_day = lambda x: convert_tz(x).day
get_hour = lambda x: convert_tz(x).hour
get_day_of_week = lambda x: convert_tz(x).weekday()

In [36]:
# parse out date and time elements as Shanghai time
past_tasks['completed_date'] = pd.to_datetime(past_tasks['completed_date'])
past_tasks['year'] = past_tasks['completed_date'].map(get_year)
past_tasks['month'] = past_tasks['completed_date'].map(get_month)
past_tasks['date'] = past_tasks['completed_date'].map(get_date)
past_tasks['day'] = past_tasks['completed_date'].map(get_day)
past_tasks['hour'] = past_tasks['completed_date'].map(get_hour)
past_tasks['dow'] = past_tasks['completed_date'].map(get_day_of_week)
past_tasks = past_tasks.drop(labels=['completed_date'], axis=1)
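
The lambdas above convert each timestamp row by row, some of them more than once. A vectorized sketch using the pandas `.dt` accessor is shown below. It assumes the timestamps parse as UTC, and the two sample strings are made up for illustration rather than taken from the export.

```python
import pandas as pd

# Made-up UTC timestamps standing in for the completed_date column.
raw = pd.Series(["2019-01-01T20:30:00Z", "2019-06-15T03:05:00Z"])

# Parse as UTC, then convert the whole column to Shanghai time at once.
local = pd.to_datetime(raw, utc=True).dt.tz_convert("Asia/Shanghai")

parts = pd.DataFrame({
    "year": local.dt.year,
    "month": local.dt.strftime("%Y-%m"),
    "date": local.dt.strftime("%Y-%m-%d"),
    "day": local.dt.day,
    "hour": local.dt.hour,
    "dow": local.dt.weekday,  # Monday = 0
})
print(parts)
```

The first sample falls on the following calendar day in Shanghai (UTC+8), which is the case the time zone conversion is meant to handle.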

In [37]:
# past_tasks.head()

In [38]:
# save to CSV
past_tasks.to_csv("data/todost-tasks-completed.csv", index=False)