Import tasks via CSV (#51)
* Bare start on CSV support

* Move core of CSV importer to operations

* More validations, break out validation function

* Validate dates and TaskList; convert errors to list of dictionaries

* Finish upsert code, and documentation

* Print msgs from the mgmt command, not the operations module

* Handle BOM marks

* Handle both in-memory and local file objects

* Update readme

* Working browser-upload view

* Bail on incorrect headers

* Fix default values and finish example spreadsheet

* Change column order, update docs

* Update index.md for RTD

* First round of responses to PR feedback

* Restore independent summaries/errors/upserts properties

* PR responses

* Split off reusable date validator into separate function

* Fix URLs append

* General test suite for CSV importer
shacker committed Mar 26, 2019
1 parent 184084c commit 4a99d90
Showing 15 changed files with 599 additions and 15 deletions.
77 changes: 72 additions & 5 deletions README.md
@@ -14,6 +14,7 @@ assignment application for Django, designed to be dropped into an existing site
* Public-facing submission form for tickets
* Mobile-friendly (work in progress)
* Separate view for My Tasks (across lists)
* Batch-import tasks via CSV


## Requirements
@@ -45,14 +46,13 @@ All tasks are "created by" the current user and can optionally be "assigned to"

django-todo v2 makes use of features only available in Django 2.0. It will not work in previous versions. v2 is only tested against Python 3.x -- no guarantees if running it against older versions.

# Installation
## Installation

django-todo is a Django app, not a project site. It needs a site to live in. You can either install it into an existing Django project site, or clone the django-todo [demo site (GTD)](https://github.com/shacker/gtd).

If using your own site, be sure you have jQuery and Bootstrap wired up and working.

django-todo pages that require it will insert additional CSS/JavaScript into page heads,
so your project's base templates must include:
django-todo views that require it will insert additional CSS/JavaScript into page heads, so your project's base templates must include:

```jinja
{% block extrahead %}{% endblock extrahead %}
@@ -100,13 +100,17 @@ django-todo makes use of the Django `messages` system. Make sure you have someth

Log in and access `/todo`!

### Customizing Templates

The provided templates are fairly bare-bones, and are meant as starting points only. Unlike previous versions of django-todo, they now ship as Bootstrap examples, but feel free to override them - there is no hard dependency on Bootstrap. To override a template, create a `todo` folder in your project's `templates` dir, then copy the template you want to override from django-todo source and into that dir.

### Filing Public Tickets

If you wish to use the public ticket-filing system, first create the list into which those tickets should be filed, then add its slug to `TODO_DEFAULT_LIST_SLUG` in settings (more on settings below).

## Settings

Optional configuration options:
Optional configuration params, which can be added to your project settings:

```python
# Restrict access to ALL todo lists/views to `is_staff` users.
@@ -141,6 +145,46 @@ The current django-todo version number is available from the [todo package](http

python -c "import todo; print(todo.__version__)"

## Importing Tasks via CSV

django-todo has the ability to batch-import ("upsert") tasks from a specifically formatted CSV spreadsheet. This ability is provided through both a management command and a web interface.

**Management Command**

`./manage.py import_csv -f /path/to/file.csv`

**Web Importer**

Link from your navigation to `{% url "todo:import_csv" %}`


### CSV Formatting

Copy `todo/data/import_example.csv` to another location on your system and edit in a spreadsheet or directly.

**Do not edit the header row!**

The first four columns: `'Title', 'Group', 'Task List', 'Created By'` are required -- all others are optional and should work pretty much exactly like manual task entry via the web UI.

Note: Internally, Tasks are keyed to TaskLists, not to Groups (TaskLists are in Groups). However, we request the Group in the CSV
because it's possible to have multiple TaskLists with the same name in different groups; i.e. we need it for namespacing and permissions.
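
As a rough illustration of the header and required-column rules, here is a minimal standard-library sketch. The column names come from `todo/data/import_example.csv`; the helper names `check_header` and `missing_required` are hypothetical and are not taken from django-todo's importer.

```python
import csv
import io

# Header row as it appears in todo/data/import_example.csv
EXPECTED_HEADER = [
    "Title", "Group", "Task List", "Created By", "Created Date",
    "Due Date", "Completed", "Assigned To", "Note", "Priority",
]
# The first four columns are the required ones
REQUIRED_COLUMNS = EXPECTED_HEADER[:4]


def check_header(fileobj):
    """Return True if the first CSV row matches the expected header exactly."""
    reader = csv.reader(fileobj)
    header = next(reader, None)  # None for an empty file
    return header == EXPECTED_HEADER


def missing_required(row):
    """Return the required column names that are empty in a row dict."""
    return [col for col in REQUIRED_COLUMNS if not (row.get(col) or "").strip()]
```

A caller could bail out early when `check_header` returns `False`, and skip any row for which `missing_required` returns a non-empty list.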


### Import Rules

Because data entered via CSV is not going through the same view permissions enforced in the rest of django-todo, and to simplify data dependency logic, and to pre-empt disagreements between django-todo users, the importer will *not* create new users, groups, or task lists. All users, groups, and task lists referenced in your CSV must already exist, and group memberships must be correct.

Any validation error (e.g. unparse-able dates, incorrect group memberships) **will result in that row being skipped.**

A report of rows upserted and rows skipped (with line numbers and reasons) is provided at the end of the run.
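
To make the "unparse-able dates" rule concrete, here is a small sketch of a date check. The `YYYY-MM-DD` format is an assumption inferred from the example CSV, and the function names are hypothetical rather than the importer's own.

```python
from datetime import datetime


def validate_date(value, fmt="%Y-%m-%d"):
    """Return a date for a well-formed string, or None for an empty cell.

    Raises ValueError for anything that does not parse, so the caller can
    record the error and skip the row.
    """
    value = (value or "").strip()
    if not value:
        return None
    return datetime.strptime(value, fmt).date()


def check_row_dates(row):
    """Collect error messages for the date columns of one row dict."""
    errors = []
    for col in ("Created Date", "Due Date"):
        try:
            validate_date(row.get(col))
        except ValueError:
            errors.append(f"Could not parse {col}: {row.get(col)!r}")
    return errors
```

Under this sketch, the third row of the example spreadsheet (Created Date `2015-06-248`) would be reported and skipped, while empty date cells pass.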

### Upsert Logic

For each valid row, we need to decide whether to create a new task or update an existing one. django-todo matches on the unique combination of the four required columns. If we find a task that matches those, we *update* the rest of the columns. In other words, if you import a CSV once, then edit the Assigned To for a task and import it again, the original task will be updated with a new assignee (and same for the other columns).

Otherwise we create a new task.
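
The matching rule can be sketched without a database. In this hypothetical example a plain dict stands in for the Task table, keyed on the four required columns; it illustrates the idea only and is not the importer's actual code.

```python
KEY_COLUMNS = ("Title", "Group", "Task List", "Created By")


def upsert(store, row):
    """Insert or update `row` in `store`, a dict keyed on the four required columns.

    Returns "created" or "updated". `store` stands in for the database here.
    """
    key = tuple(row[col] for col in KEY_COLUMNS)
    if key in store:
        # Same four key values: overwrite the remaining columns
        store[key].update(row)
        return "updated"
    store[key] = dict(row)
    return "created"
```

Importing the same row twice with a changed Assigned To leaves a single stored task carrying the new assignee. In Django terms this resembles what `QuerySet.update_or_create` does, though whether the importer uses that method is not stated here.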


## Mail Tracking

What if you could turn django-todo into a shared mailbox? Django-todo includes an optional feature that allows emails
@@ -208,6 +252,8 @@ A mail worker can be started with:
./manage.py mail_worker test_tracker
```

Some views and URLs were renamed in 2.0 for logical consistency. If this affects you, see source code and the demo GTD site for reference to the new URL names.

If you want to log mail events, make sure to properly configure django logging:

```python
@@ -240,7 +286,28 @@ django-todo uses pytest exclusively for testing. The best way to run the suite i

The previous `tox` system was removed with the v2 release, since we no longer aim to support older Python or Django versions.

# Version History

## Upgrade Notes

django-todo 2.0 was rebuilt almost from the ground up, and included some radical changes, including model name changes. As a result, it is *not compatible* with data from django-todo 1.x. If you would like to upgrade an existing installation, try this:

* Use `./manage.py dumpdata todo --indent 4 > todo.json` to export your old todo data
* Edit the dump file, replacing the old model names `Item` and `List` with the new model names (`Task` and `TaskList`)
* Delete your existing todo data
* Uninstall the old todo app and reinstall
* Migrate, then use `./manage.py loaddata todo.json` to import the edited data

### Why not provide migrations?

That was the plan, but unfortunately, `makemigrations` created new tables and dropped the old ones, making this a destructive update. Renaming models is unfortunately not something `makemigrations` can do, and I really didn't want to keep the badly named original models. Sorry!

### Datepicker

django-todo no longer references a jQuery datepicker, but defaults to the native HTML5 browser datepicker (not supported by Safari, unfortunately). Feel free to implement one of your choosing.

## Version History

**2.4.0** Added ability to batch-import tasks via CSV

**2.3.0** Implement mail tracking system

52 changes: 48 additions & 4 deletions docs/index.md
@@ -14,6 +14,7 @@ assignment application for Django, designed to be dropped into an existing site
* Public-facing submission form for tickets
* Mobile-friendly (work in progress)
* Separate view for My Tasks (across lists)
* Batch-import tasks via CSV


## Requirements
@@ -44,7 +45,7 @@ All tasks are "created by" the current user and can optionally be "assigned to"

django-todo v2 makes use of features only available in Django 2.0. It will not work in previous versions. v2 is only tested against Python 3.x -- no guarantees if running it against older versions.

# Installation
## Installation

django-todo is a Django app, not a project site. It needs a site to live in. You can either install it into an existing Django project site, or clone the django-todo [demo site (GTD)](https://github.com/shacker/gtd).

@@ -132,7 +133,6 @@ The current django-todo version number is available from the [todo package](http

python -c "import todo; print(todo.__version__)"


## Upgrade Notes

django-todo 2.0 was rebuilt almost from the ground up, and included some radical changes, including model name changes. As a result, it is *not compatible* with data from django-todo 1.x. If you would like to upgrade an existing installation, try this:
@@ -153,7 +153,7 @@ django-todo no longer references a jQuery datepicker, but defaults to native htm

### URLs

Some views and URLs were renamed for logical consistency. If this affects you, see source code and the demo GTD site for reference to the new URL names.
Some views and URLs were renamed in 2.0 for logical consistency. If this affects you, see source code and the demo GTD site for reference to the new URL names.


## Running Tests
@@ -166,7 +166,49 @@ django-todo uses pytest exclusively for testing. The best way to run the suite i

The previous `tox` system was removed with the v2 release, since we no longer aim to support older Python or Django versions.

# Version History
## Importing Tasks via CSV

django-todo has the ability to batch-import ("upsert") tasks from a specifically formatted CSV spreadsheet. This ability is provided through both a management command and a web interface.

**Management Command**

`./manage.py import_csv -f /path/to/file.csv`

**Web Importer**

Link from your navigation to `{% url "todo:import_csv" %}`

### Import Rules

Because data entered via CSV is not going through the same view permissions enforced in the rest of django-todo, and to simplify data dependency logic, and to pre-empt disagreements between django-todo users, the importer will *not* create new users, groups, or task lists. All users, groups, and task lists referenced in your CSV must already exist, and group memberships must be correct.

Any validation error (e.g. unparse-able dates, incorrect group memberships) **will result in that row being skipped.**

A report of rows upserted and rows skipped (with line numbers and reasons) is provided at the end of the run.
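
For reference, the skipped-row report can be pictured as a list of single-key dictionaries mapping a CSV row number to its messages, which is the shape described in a comment in the `import_csv` management command. The formatter below is a hypothetical sketch of turning that structure into report lines, not the command's own code.

```python
def format_errors(errors):
    """Turn [{row_number: [messages]}, ...] into printable report lines."""
    lines = []
    for error_dict in errors:
        for row_number, messages in error_dict.items():
            lines.append(f"Skipped CSV row {row_number}:")
            lines.extend(f"- {msg}" for msg in messages)
    return lines
```

Printing each returned line in order gives one heading per skipped row followed by its reasons.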

### CSV Formatting

Copy `todo/data/import_example.csv` to another location on your system and edit in a spreadsheet or directly.

**Do not edit the header row!**

The first four columns: `'Title', 'Group', 'Task List', 'Created By'` are required -- all others are optional and should work pretty much exactly like manual task entry via the web UI.

Note: Internally, Tasks are keyed to TaskLists, not to Groups (TaskLists are in Groups). However, we request the Group in the CSV
because it's possible to have multiple TaskLists with the same name in different groups; i.e. we need it for namespacing and permissions.

### Upsert Logic

For each valid row, we need to decide whether to create a new task or update an existing one. django-todo matches on the unique combination of the four required columns. If we find a task that matches those, we *update* the rest of the columns. In other words, if you import a CSV once, then edit the Assigned To for a task and import it again, the original task will be updated with a new assignee (and same for the other columns).

Otherwise we create a new task.


## Version History

**2.3.0** Added ability to batch-import tasks via CSV

**2.2.1** Convert task delete and toggle_done views to POST only

**2.2.0** Re-instate enforcement of TODO_STAFF_ONLY setting

@@ -225,3 +267,5 @@ ALL groups, not just the groups they "belong" to)
**0.9.1** - Removed context_processors.py - leftover turdlet

**0.9** - First release


2 changes: 1 addition & 1 deletion test_settings.py
@@ -80,7 +80,7 @@
},
'django': {
'handlers': ['console'],
'level': 'DEBUG',
'level': 'WARNING',
'propagate': True,
},
'django.request': {
4 changes: 4 additions & 0 deletions todo/data/import_example.csv
@@ -0,0 +1,4 @@
Title,Group,Task List,Created By,Created Date,Due Date,Completed,Assigned To,Note,Priority
Make dinner,Scuba Divers,Web project,shacker,,2019-06-14,No,,Please check with mgmt first,3
Bake bread,Scuba Divers,Example List,mr_random,2012-03-14,,Yes,,,
Bring dessert,Scuba Divers,Web project,user1,2015-06-248,,,user1,Every generation throws a hero up the pop charts,77
57 changes: 57 additions & 0 deletions todo/management/commands/import_csv.py
@@ -0,0 +1,57 @@
import sys
from typing import Any
from pathlib import Path

from django.core.management.base import BaseCommand, CommandParser

from todo.operations.csv_importer import CSVImporter


class Command(BaseCommand):
    help = """Import specifically formatted CSV file containing incoming tasks to be loaded.
    For specific format of inbound CSV, see data/import_example.csv.
    For documentation on upsert logic and required fields, see README.md.
    """

    def add_arguments(self, parser: CommandParser) -> None:

        parser.add_argument(
            "-f", "--file", dest="file", default=None, help="Path to inbound CSV file."
        )

    def handle(self, *args: Any, **options: Any) -> None:
        # Need a file to proceed
        if not options.get("file"):
            print("Sorry, we need a filename to work from.")
            sys.exit(1)

        filepath = Path(options["file"])

        if not filepath.exists():
            print(f"Sorry, couldn't find file: {filepath}")
            sys.exit(1)

        # Encoding "utf-8-sig" means "ignore byte order mark (BOM), which Excel inserts when saving CSVs."
        with filepath.open(mode="r", encoding="utf-8-sig") as fileobj:
            importer = CSVImporter()
            results = importer.upsert(fileobj, as_string_obj=True)

        # Report successes, failures and summaries
        print()
        if results["upserts"]:
            for upsert_msg in results["upserts"]:
                print(upsert_msg)

        # Stored errors has the form:
        # self.errors = [{3: ["Incorrect foo", "Non-existent bar"]}, {7: [...]}]
        if results["errors"]:
            for error_dict in results["errors"]:
                for k, error_list in error_dict.items():
                    print(f"\nSkipped CSV row {k}:")
                    for msg in error_list:
                        print(f"- {msg}")

        print()
        if results["summaries"]:
            for summary_msg in results["summaries"]:
                print(summary_msg)
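
The effect of the `utf-8-sig` encoding used in the command above can be checked in isolation. This is a standalone sketch with made-up sample bytes; `first_header` is a hypothetical helper and not part of django-todo.

```python
import csv
import io

# Bytes as Excel might save them: a UTF-8 BOM followed by the CSV text.
raw = b"\xef\xbb\xbfTitle,Group\nMake dinner,Scuba Divers\n"


def first_header(data, encoding):
    """Decode `data` and return the first column name of the header row."""
    text = io.TextIOWrapper(io.BytesIO(data), encoding=encoding, newline="")
    return next(csv.reader(text))[0]
```

Decoding with plain `utf-8` leaves the BOM character attached to the first column name, so an exact header comparison would fail; `utf-8-sig` strips it.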
22 changes: 22 additions & 0 deletions todo/migrations/0009_priority_optional.py
@@ -0,0 +1,22 @@
# Generated by Django 2.1.7 on 2019-03-18 23:14

from django.db import migrations, models


class Migration(migrations.Migration):

    dependencies = [
        ('todo', '0008_mail_tracker'),
    ]

    operations = [
        migrations.AlterModelOptions(
            name='task',
            options={'ordering': ['priority', 'created_date']},
        ),
        migrations.AlterField(
            model_name='task',
            name='priority',
            field=models.PositiveIntegerField(blank=True, null=True),
        ),
    ]
4 changes: 2 additions & 2 deletions todo/models.py
@@ -82,7 +82,7 @@ class Task(models.Model):
        on_delete=models.CASCADE,
    )
    note = models.TextField(blank=True, null=True)
    priority = models.PositiveIntegerField()
    priority = models.PositiveIntegerField(blank=True, null=True)

    # Has due date for an instance of this object passed?
    def overdue_status(self):
@@ -115,7 +115,7 @@ def merge_into(self, merge_target):
        self.delete()

    class Meta:
        ordering = ["priority"]
        ordering = ["priority", "created_date"]


class Comment(models.Model):
Empty file added todo/operations/__init__.py
Empty file.
