Clarified 'skip_diff' documentation (#1655)
* updated docs to clarify skip behaviour

* updated changelog

* corrected docs

* doc update
matthewhegarty committed Oct 18, 2023
1 parent 0e5ee68 commit 46ec89d
Showing 3 changed files with 55 additions and 24 deletions.
15 changes: 9 additions & 6 deletions docs/advanced_usage.rst
@@ -445,16 +445,17 @@ Handling duplicate data

If an existing instance is identified during import, then the existing instance will be updated, regardless of whether
the data in the import row is the same as the persisted data or not. You can configure the import process to skip the
-row if it is duplicate by using setting ``skip_unchanged``.
+row if it is duplicate by using setting :attr:`~import_export.resources.ResourceOptions.skip_unchanged`.

-If ``skip_unchanged`` is enabled, then the import process will check each defined import field and perform a simple
-comparison with the existing instance, and if all comparisons are equal, then the row is skipped. Skipped rows are
-recorded in the row ``Result`` object.
+If :attr:`~import_export.resources.ResourceOptions.skip_unchanged` is enabled, then the import process will check each
+defined import field and perform a simple comparison with the existing instance, and if all comparisons are equal, then
+the row is skipped. Skipped rows are recorded in the row :class:`~import_export.results.RowResult` object.

You can override the :meth:`~.skip_row` method to have full control over the skip row implementation.

-Also, the ``report_skipped`` option controls whether skipped records appear in the import
-``Result`` object, and whether skipped records will show in the import preview page in the Admin UI::
+Also, the :attr:`~import_export.resources.ResourceOptions.report_skipped` option controls whether skipped records appear
+in the import :class:`~import_export.results.RowResult` object, and whether skipped records will show in the import
+preview page in the Admin UI::

class BookResource(resources.ModelResource):

@@ -613,6 +614,8 @@ Note that to use the :class:`~import_export.admin.ExportMixin` or
:class:`~import_export.admin.ExportActionMixin`, you must declare this mixin
**before** ``admin.ModelAdmin``.

+.. _import-process:
+
Importing
---------

1 change: 1 addition & 0 deletions docs/changelog.rst
@@ -13,6 +13,7 @@ Please refer to :doc:`release notes<release_notes>`.
- added specific check for missing ``import_id_fields`` (#1645)
- Enable optional tablib dependencies (#1647)
- Removed unused method ``utils.original()``
+- Clarified ``skip_diff`` documentation (#1655)

4.0.0-alpha.5 (2023-09-22)
--------------------------
63 changes: 45 additions & 18 deletions import_export/resources.py
@@ -110,20 +110,38 @@ class ResourceOptions:

skip_unchanged = False
"""
-Controls if the import should skip unchanged records. Default value is
-False
+Controls if the import should skip unchanged records.
+If ``True``, then each existing instance is compared with the instance to be
+imported, and if there are no changes detected, the row is recorded as skipped,
+and no database update takes place.
+The advantages of enabling this option are:
+#. Avoids unnecessary database operations which can result in performance
+improvements for large datasets.
+#. Skipped records are recorded in each :class:`~import_export.results.RowResult`.
+#. Skipped records are clearly visible in the
+:ref:`import confirmation page<import-process>`.
+For the default ``skip_unchanged`` logic to work, the
+:attr:`~import_export.resources.ResourceOptions.skip_diff` must also be ``False``
+(which is the default):
+Default value is ``False``.
"""

report_skipped = True
"""
-Controls if the result reports skipped rows. Default value is True
+Controls if the result reports skipped rows. Default value is ``True``.
"""

clean_model_instances = False
"""
Controls whether ``instance.full_clean()`` is called during the import
process to identify potential validation errors for each (non skipped) row.
-The default value is False.
+The default value is ``False``.
"""

chunk_size = None
@@ -135,16 +153,22 @@ class ResourceOptions:
skip_diff = False
"""
Controls whether or not an instance should be diffed following import.
By default, an instance is copied prior to insert, update or delete.
After each row is processed, the instance's copy is diffed against the original,
and the value stored in each :class:`~import_export.results.RowResult`.
If diffing is not required, then disabling the diff operation by setting this value
to ``True`` improves performance, because the copy and comparison operations are
skipped for each row.
-If enabled, then ``skip_row()`` checks do not execute, because 'skip' logic requires
-comparison between the stored and imported versions of a row.
-If enabled, then HTML row reports are also not generated (see ``skip_html_diff``).
-The default value is False.
+If enabled, then :meth:`~import_export.resources.Resource.skip_row` checks do not
+execute, because 'skip' logic requires comparison between the stored and imported
+versions of a row.
+If enabled, then HTML row reports are also not generated, meaning that the
+:attr:`~import_export.resources.ResourceOptions.skip_html_diff` value is ignored.
+The default value is ``False``.
"""

skip_html_diff = False
@@ -153,13 +177,16 @@ class ResourceOptions:
By default, the difference between a stored copy and an imported instance
is generated in HTML form and stored in each
:class:`~import_export.results.RowResult`.
-The HTML report is used to present changes on the confirmation screen in the admin
-site, hence when this value is ``True``, then changes will not be presented on the
-confirmation screen.
+The HTML report is used to present changes in the
+:ref:`import confirmation page<import-process>` in the admin site, hence when this
+value is ``True``, then changes will not be presented on the confirmation screen.
If the HTML report is not required, then setting this value to ``True`` improves
performance, because the HTML generation is skipped for each row.
This is a useful optimization when importing large datasets.
-The default value is False.
+The default value is ``False``.
"""

use_bulk = False
@@ -175,13 +202,13 @@ class ResourceOptions:
The default is to create objects in batches of 1000.
See `bulk_create()
<https://docs.djangoproject.com/en/dev/ref/models/querysets/#bulk-create>`_.
-This parameter is only used if ``use_bulk`` is True.
+This parameter is only used if ``use_bulk`` is ``True``.
"""

force_init_instance = False
"""
-If True, this parameter will prevent imports from checking the database for existing
-instances.
+If ``True``, this parameter will prevent imports from checking the database for
+existing instances.
Enabling this parameter is a performance enhancement if your import dataset is
guaranteed to contain new instances.
"""
@@ -190,7 +217,7 @@ class ResourceOptions:
"""
DB Connection name to use for db transactions. If not provided,
``router.db_for_write(model)`` will be evaluated and if it's missing,
-DEFAULT_DB_ALIAS constant ("default") is used.
+``DEFAULT_DB_ALIAS`` constant ("default") is used.
"""

store_row_values = False
@@ -211,10 +238,10 @@ class ResourceOptions:

use_natural_foreign_keys = False
"""
-If True, use_natural_foreign_keys = True will be passed to all foreign
+If ``True``, this value will be passed to all foreign
key widget fields whose models support natural foreign keys. That is,
the model has a natural_key function and the manager has a
-get_by_natural_key function.
+``get_by_natural_key()`` function.
"""
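
The interaction between ``skip_unchanged``, ``skip_diff`` and ``skip_html_diff`` described in the docstrings above can be modelled with a minimal standalone sketch. It does not import django-import-export and is not the library's implementation: ``Options``, ``should_skip_row`` and ``html_diff_generated`` are hypothetical names, and rows are plain dictionaries rather than model instances. It only illustrates the documented rule that the default skip check runs when ``skip_unchanged`` is ``True`` and ``skip_diff`` is ``False``.

```python
from dataclasses import dataclass


@dataclass
class Options:
    """Hypothetical stand-in for three of the ResourceOptions flags."""

    skip_unchanged: bool = False
    skip_diff: bool = False
    skip_html_diff: bool = False


def should_skip_row(options, original, incoming, import_fields):
    """Return True if the row would be recorded as skipped.

    Simplified model: a row is skipped only when skip_unchanged is on,
    skip_diff is off, and every import field compares equal.
    """
    if not options.skip_unchanged or options.skip_diff:
        # With skip_diff enabled, the stored and imported versions are not
        # compared, so the skip check does not run.
        return False
    return all(original.get(f) == incoming.get(f) for f in import_fields)


def html_diff_generated(options):
    """HTML row reports depend on the diff, so skip_diff takes precedence."""
    return not options.skip_diff and not options.skip_html_diff


# Illustrative data: the stored row and the imported row are identical.
stored = {"id": 1, "name": "Some book", "price": "10.00"}
row = {"id": 1, "name": "Some book", "price": "10.00"}
fields = ["id", "name", "price"]

# Unchanged row with skip_unchanged enabled: skipped.
print(should_skip_row(Options(skip_unchanged=True), stored, row, fields))

# Same row, but skip_diff is also enabled: the skip check is bypassed.
print(should_skip_row(Options(skip_unchanged=True, skip_diff=True), stored, row, fields))
```

In this model the first call prints ``True`` and the second prints ``False``, which mirrors the note that ``skip_diff`` must remain ``False`` for the default ``skip_unchanged`` logic to take effect.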

