Skip to content

Commit

Permalink
added skip_html_diff parameter (django-import-export#1329)
Browse files Browse the repository at this point in the history
  • Loading branch information
matthewhegarty committed Dec 6, 2021
1 parent fd65990 commit 3dede05
Show file tree
Hide file tree
Showing 9 changed files with 179 additions and 54 deletions.
1 change: 1 addition & 0 deletions docs/changelog.rst
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@ Changelog
---------

- Big integer support for Integer widget (#788)
- Added `skip_html_diff` meta attribute (#1329)
- Run compilemessages command to keep .mo files in sync (#1299)
- Add ability to rollback the import on validation error (#1339)
- Fix missing migration on example app (#1346)
Expand Down
21 changes: 19 additions & 2 deletions import_export/resources.py
Original file line number Diff line number Diff line change
Expand Up @@ -109,7 +109,7 @@ class ResourceOptions:

report_skipped = True
"""
Controls if the result reports skipped rows Default value is True
Controls if the result reports skipped rows. Default value is True
"""

clean_model_instances = False
Expand All @@ -133,6 +133,23 @@ class ResourceOptions:
stored in each ``RowResult``.
If diffing is not required, then disabling the diff operation by setting this value to ``True``
improves performance, because the copy and comparison operations are skipped for each row.
If enabled, then ``skip_row()`` checks do not execute, because 'skip' logic requires
comparison between the stored and imported versions of a row.
If enabled, then HTML row reports are also not generated (see ``skip_html_diff``).
The default value is False.
"""

skip_html_diff = False
"""
Controls whether or not a HTML report is generated after each row.
By default, the difference between a stored copy and an imported instance
is generated in HTML form and stored in each ``RowResult``.
The HTML report is used to present changes on the confirmation screen in the admin site,
hence when this value is ``True``, then changes will not be presented on the confirmation
screen.
If the HTML report is not required, then setting this value to ``True`` improves performance,
because the HTML generation is skipped for each row.
This is a useful optimization when importing large datasets.
The default value is False.
"""

Expand Down Expand Up @@ -687,7 +704,7 @@ def import_row(self, row, instance_loader, using_transactions=True, dry_run=Fals
if not skip_diff:
diff.compare_with(self, instance, dry_run)

if not skip_diff:
if not skip_diff and not self._meta.skip_html_diff:
row_result.diff = diff.as_html()
self.after_import_row(row, row_result, **kwargs)

Expand Down
2 changes: 2 additions & 0 deletions requirements/test.txt
Original file line number Diff line number Diff line change
Expand Up @@ -4,3 +4,5 @@ mysqlclient
coveralls
chardet
pytz
memory-profiler
django-extensions
71 changes: 59 additions & 12 deletions tests/bulk/README.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,17 @@
## Bulk import testing

This scripts outlines the steps used to profile bulk loading.
The `bulk_import.py` script is used to profile run duration and memory during bulk load testing.

### Pre-requisites

- [Docker](https://docker.com)
- [virtualenvwrapper](https://virtualenvwrapper.readthedocs.io/en/latest/command_ref.html)

### Test environment

The following tests were run on the following platform:

- Thinkpad T470 i5 processor (Ubuntu 18.04)
- python 3.8.1
- Postgres 10 (docker container)
Expand All @@ -8,7 +20,9 @@

```bash
# create venv and install django-import-export dependencies
pip install memory-profiler
cd <PROJECT_HOME>
mkvirtualenv -p python3 djangoimportexport
pip install -r requirements/base.txt -r requirements/test.txt
```

### Create Postgres DB
Expand All @@ -19,28 +33,31 @@ export IMPORT_EXPORT_POSTGRESQL_USER=pguser
export IMPORT_EXPORT_POSTGRESQL_PASSWORD=pguserpass
export DJANGO_SETTINGS_MODULE=settings

cd ~/Projects/django-import-export/tests/bulk
cd <PROJECT_HOME>/tests

# start a local postgres instance
docker-compose up -d db
docker-compose -f bulk/docker-compose.yml up -d

cd ..
./manage.py migrate
./manage.py test

# only required if you want to login to the Admin site
./manage.py createsuperuser --username admin --email=email@example.com
```

### bulk_create
### Running script

```bash
python3 import_book.py
# run creates, updates, and deletes
./manage.py runscript bulk_import

# pass 'create', 'update' or 'delete' to run the single test
./manage.py runscript bulk_import --script-args create
```

### Results

- 20k book entries
- Used 20k book entries
- Memory is reported as the peak memory value whilst running the test script

#### bulk_create
Expand Down Expand Up @@ -78,6 +95,10 @@ def get_or_init_instance(self, instance_loader, row):

#### bulk_update

```bash
./manage.py runscript bulk_import --script-args update
```

##### Default settings

- `skip_diff = False`
Expand All @@ -101,8 +122,21 @@ def get_or_init_instance(self, instance_loader, row):
| `use_bulk=True, batch_size=1000` | 21.56 | 21.25 |


- `skip_diff = False`

| Condition | Time (secs) | Memory (MB) |
| ---------------------------------------- | ----------- | ------------- |
| `use_bulk=True, batch_size=1000` | 9.26 | 8.51 |
| `skip_html_diff=True, batch_size=1000` | 8.69 | 7.50 |
| `skip_unchanged=True, batch_size=1000` | 5.42 | 7.34 |


#### bulk delete

```bash
./manage.py runscript bulk_import --script-args delete
```

##### Default settings

- `skip_diff = False`
Expand All @@ -125,15 +159,28 @@ def get_or_init_instance(self, instance_loader, row):
| `use_bulk=True, batch_size=None` | 14.08 | 39.40 |
| `use_bulk=True, batch_size=1000` | 15.37 | 32.70 |

### Clearing DB
### Checking DB

Note that the db is cleared down after each test run.
You need to uncomment the `delete()` calls to be able to view data.

```bash
./manage.py shell_plus

Book.objects.all().count()
```

### Clear down

(set environment variables before running)
Optional clear down of resources:

```bash
./manage.py shell
# remove the test db container
docker-compose -f bulk/docker-compose.yml down -v

from core.models import Book
Book.objects.all().delete()
# remove venv
deactivate
rmvirtualenv djangoimportexport
```

### References
Expand Down
5 changes: 3 additions & 2 deletions tests/bulk/docker-compose.yml
Original file line number Diff line number Diff line change
Expand Up @@ -7,8 +7,9 @@ services:
DB_HOST: 'db'
DB_PORT: '5432'
DB_NAME: 'import_export'
IMPORT_EXPORT_POSTGRESQL_USER: 'pguser'
IMPORT_EXPORT_POSTGRESQL_PASSWORD: 'pguserpass'
IMPORT_EXPORT_POSTGRESQL_USER: ${IMPORT_EXPORT_POSTGRESQL_USER}
IMPORT_EXPORT_POSTGRESQL_PASSWORD: ${IMPORT_EXPORT_POSTGRESQL_PASSWORD}
POSTGRES_PASSWORD: ${IMPORT_EXPORT_POSTGRESQL_PASSWORD}
image: postgres:10
restart: "no"
ports:
Expand Down
18 changes: 18 additions & 0 deletions tests/core/tests/test_resources.py
Original file line number Diff line number Diff line change
Expand Up @@ -1496,6 +1496,24 @@ class Meta:
self.assertEqual(2, mock_get_import_fields.call_count)


class SkipHtmlDiffTest(TestCase):

def test_skip_html_diff(self):
class BookResource(resources.ModelResource):

class Meta:
model = Book
skip_html_diff = True

resource = BookResource()
self.dataset = tablib.Dataset(headers=['id', 'name', 'birthday'])
self.dataset.append(['', 'A.A.Milne', '1882test-01-18'])

with mock.patch('import_export.resources.Diff.as_html') as mock_as_html:
resource.import_data(self.dataset, dry_run=True)
mock_as_html.assert_not_called()


class BulkTest(TestCase):

def setUp(self):
Expand Down
Empty file added tests/scripts/__init__.py
Empty file.

0 comments on commit 3dede05

Please sign in to comment.