Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor to support latest escaping logic #1638

Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
2 changes: 2 additions & 0 deletions .github/workflows/test.yml
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,8 @@ on:
pull_request:
branches:
- main
# temporary - remove after release 4
- release-4

jobs:
test:
Expand Down
4 changes: 3 additions & 1 deletion Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -48,5 +48,7 @@ install-docs-requirements: ## install requirements for docs

install-requirements: install-base-requirements install-test-requirements install-docs-requirements

build-html-doc: ## builds the project documentation in HTML format
## builds the project documentation in HTML format
## run `pip install -e .` first if running locally
build-html-doc:
DJANGO_SETTINGS_MODULE=tests.settings make html -C docs
48 changes: 36 additions & 12 deletions docs/advanced_usage.rst
Original file line number Diff line number Diff line change
Expand Up @@ -879,7 +879,7 @@ Security
--------

Enabling the Admin interface means that you should consider the security implications. Some or all of the following
points may be relevant:
points may be relevant.

Is there potential for untrusted imports?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Expand Down Expand Up @@ -907,30 +907,54 @@ What is the potential risk for exported data?
* Could any exported data be rendered in HTML? For example, csv is exported and then loaded into another
web application. In this case, untrusted input could contain malicious code such as active script content.

You should in all cases review `Django security documentation <https://docs.djangoproject.com/en/stable/topics/security/>`_
before deploying a live Admin interface instance.

Mitigating security risks
^^^^^^^^^^^^^^^^^^^^^^^^^

Please read the following topics carefully to understand how you can improve the security of your implementation.

Sanitize exports
""""""""""""""""

By default, import-export does not sanitize or process imported data. Malicious content, such as script directives,
can be imported into the database, and can be exported without any modification.

You can optionally configure import-export to sanitize data on export. There are two settings which enable this:
.. note::

#. :ref:`IMPORT_EXPORT_ESCAPE_HTML_ON_EXPORT`
#. :ref:`IMPORT_EXPORT_ESCAPE_FORMULAE_ON_EXPORT`
HTML content, if exported into 'html' format, will be sanitized to remove scriptable content.
This sanitization is performed by the ``tablib`` library.

.. warning::
You can optionally configure import-export to sanitize Excel formula data on export. See
:ref:`IMPORT_EXPORT_ESCAPE_FORMULAE_ON_EXPORT`.

Enabling these settings only sanitizes data exported using the Admin Interface.
If exporting data :ref:`programmatically<exporting_data>`, then you will need to apply your own sanitization.
Enabling this setting only sanitizes data exported using the Admin Interface.
If exporting data :ref:`programmatically<exporting_data>`, then you will need to apply your own sanitization.

Limiting the available import or export types can be considered. This can be done using either of the following settings:
Limit formats
"""""""""""""

Limiting the available import or export format types can be considered. For example, if you never need to support
import or export of spreadsheet data, you can remove this format from the application.

Imports and exports can be restricted using the following settings:

#. :ref:`IMPORT_EXPORT_FORMATS`
#. :ref:`IMPORT_FORMATS`
#. :ref:`EXPORT_FORMATS`

You should in all cases review `Django security documentation <https://docs.djangoproject.com/en/dev/topics/security/>`_
before deploying a live Admin interface instance.
Set permissions
"""""""""""""""

Consider setting `permissions <https://docs.djangoproject.com/en/stable/topics/auth/default/>`_ to define which
users can import and export.

#. :ref:`IMPORT_EXPORT_IMPORT_PERMISSION_CODE`
#. :ref:`IMPORT_EXPORT_EXPORT_PERMISSION_CODE`

Raising security issues
^^^^^^^^^^^^^^^^^^^^^^^

Please refer to `SECURITY.md <https://github.com/django-import-export/django-import-export/blob/main/SECURITY.md>`_ for
details on how to escalate security issues.
Refer to `SECURITY.md <https://github.com/django-import-export/django-import-export/blob/main/SECURITY.md>`_ for
details on how to escalate security issues you may have found in import-export.
9 changes: 2 additions & 7 deletions docs/changelog.rst
Original file line number Diff line number Diff line change
@@ -1,15 +1,10 @@
Changelog
=========

4.0.0-alpha.2 (2023-09-20)
4.0.0-alpha.5 (2023-09-22)
--------------------------

- doc updates only

4.0.0-alpha.1 (2023-09-20)
--------------------------

- doc updates only
- refactor to export HTML / formulae escaping updates (#1638)

4.0.0-alpha.0 (2023-09-20)
--------------------------
Expand Down
52 changes: 35 additions & 17 deletions docs/installation.rst
Original file line number Diff line number Diff line change
Expand Up @@ -76,25 +76,45 @@ during imports. Defaults to ``import_export.tmp_storages.TempFolderStorage``.
Can be overridden on a ``ModelAdmin`` class inheriting from ``ImportMixin`` by
setting the ``tmp_storage_class`` class attribute.

.. _IMPORT_EXPORT_IMPORT_PERMISSION_CODE:

``IMPORT_EXPORT_IMPORT_PERMISSION_CODE``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If set, lists the permission code that is required for users to perform the
import action. Defaults to ``None``, which means everybody can perform
'import' action. Defaults to ``None``, which means all users can perform
imports.

Django’s built-in permissions have the codes ``add``, ``change``, ``delete``,
and ``view``. You can also add your own permissions.
and ``view``. You can also add your own permissions. For example, if you set this
value to 'import', then you can define an explicit permission for import in the example
app with:

.. code-block:: python

from core.models import Book
from django.contrib.auth.models import Permission
from django.contrib.contenttypes.models import ContentType

content_type = ContentType.objects.get_for_model(Book)
permission = Permission.objects.create(
codename="import_book",
name="Can import book",
content_type=content_type,
)

Now only users who are assigned 'import_book' permission will be able to perform
imports. For more information refer to the
`Django auth <https://docs.djangoproject.com/en/stable/topics/auth/default/>`_
documentation.

.. _IMPORT_EXPORT_EXPORT_PERMISSION_CODE:

``IMPORT_EXPORT_EXPORT_PERMISSION_CODE``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If set, lists the permission code that is required for users to perform the
“export” action. Defaults to ``None``, which means everybody can perform
exports.

Django’s built-in permissions have the codes ``add``, ``change``, ``delete``,
and ``view``. You can also add your own permissions.
Defines the same behaviour as :ref:`IMPORT_EXPORT_IMPORT_PERMISSION_CODE`, but for
export.

``IMPORT_EXPORT_CHUNK_SIZE``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Expand All @@ -120,22 +140,18 @@ Note that if you disable transaction support via configuration (or if your datab
does not support transactions), then validation errors will still be presented to the user
but valid rows will have imported.

``IMPORT_EXPORT_ESCAPE_HTML_ON_EXPORT``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If set to ``True``, strings will be HTML escaped. By default this is ``False``.

This is deprecated and will be removed in a future release. Future releases will
escape strings by default.
.. _IMPORT_EXPORT_ESCAPE_FORMULAE_ON_EXPORT:

``IMPORT_EXPORT_ESCAPE_FORMULAE_ON_EXPORT``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If set to ``True``, strings will be sanitized by removing any leading '=' character. This is to prevent execution of
Excel formulae. By default this is ``False``.

.. _IMPORT_EXPORT_FORMATS:

``IMPORT_EXPORT_FORMATS``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~~~~~~~~~~~~~~~~~~~~~

A list that defines which file formats will be allowed during imports and exports. Defaults
to ``import_export.formats.base_formats.DEFAULT_FORMATS``.
Expand All @@ -147,9 +163,10 @@ The values must be those provided in ``import_export.formats.base_formats`` e.g
from import_export.formats.base_formats import XLSX
IMPORT_EXPORT_FORMATS = [XLSX]

.. _IMPORT_FORMATS:

``IMPORT_FORMATS``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~~~~~~~~~~~~~~

A list that defines which file formats will be allowed during imports. Defaults
to ``IMPORT_EXPORT_FORMATS``.
Expand All @@ -161,6 +178,7 @@ The values must be those provided in ``import_export.formats.base_formats`` e.g
from import_export.formats.base_formats import CSV, XLSX
IMPORT_FORMATS = [CSV, XLSX]

.. _EXPORT_FORMATS:

``EXPORT_FORMATS``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Expand Down
10 changes: 6 additions & 4 deletions import_export/admin.py
Original file line number Diff line number Diff line change
Expand Up @@ -617,11 +617,13 @@ def get_export_data(self, file_format, queryset, *args, **kwargs):
if not self.has_export_permission(request):
raise PermissionDenied

data = self.get_data_for_export(request, queryset, *args, **kwargs)
export_data = file_format.export_data(
data,
escape_formulae=self.should_escape_formulae,
data = self.get_data_for_export(
request,
queryset,
*args,
**kwargs,
)
export_data = file_format.export_data(data)
encoding = kwargs.get("encoding")
if not file_format.is_binary() and encoding:
export_data = export_data.encode(encoding)
Expand Down
16 changes: 2 additions & 14 deletions import_export/formats/base_formats.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,5 @@
import html

import tablib
from django.conf import settings
from tablib.formats import registry


Expand Down Expand Up @@ -85,9 +84,7 @@ def create_dataset(self, in_stream, **kwargs):
return tablib.import_set(in_stream, format=self.get_title(), **kwargs)

def export_data(self, dataset, **kwargs):
if kwargs.pop("escape_html", None):
self._escape_html(dataset)
if kwargs.pop("escape_formulae", None):
if getattr(settings, "IMPORT_EXPORT_ESCAPE_FORMULAE_ON_EXPORT", False) is True:
self._escape_formulae(dataset)
return dataset.export(self.get_title(), **kwargs)

Expand All @@ -103,12 +100,6 @@ def can_import(self):
def can_export(self):
return hasattr(self.get_format(), "export_set")

def _escape_html(self, dataset):
for _ in dataset:
row = dataset.lpop()
row = [html.escape(str(cell)) for cell in row]
dataset.append(row)

def _escape_formulae(self, dataset):
def _do_escape(s):
return s.replace("=", "", 1) if s.startswith("=") else s
Expand Down Expand Up @@ -165,9 +156,6 @@ class HTML(TextFormat):
TABLIB_MODULE = "tablib.formats._html"
CONTENT_TYPE = "text/html"

def export_data(self, dataset, **kwargs):
return super().export_data(dataset, **kwargs)


class XLS(TablibFormat):
TABLIB_MODULE = "tablib.formats._xls"
Expand Down
12 changes: 0 additions & 12 deletions import_export/mixins.py
Original file line number Diff line number Diff line change
Expand Up @@ -79,18 +79,6 @@ def choose_import_resource_class(self, form):

class BaseExportMixin(BaseImportExportMixin):
model = None
escape_exported_data = False
escape_html = False
escape_formulae = False

@property
def should_escape_formulae(self):
v = getattr(
settings, "IMPORT_EXPORT_ESCAPE_FORMULAE_ON_EXPORT", self.escape_formulae
)
if v is True:
logger.debug("IMPORT_EXPORT_ESCAPE_FORMULAE_ON_EXPORT is enabled")
return v

def get_export_formats(self):
"""
Expand Down
8 changes: 4 additions & 4 deletions import_export/resources.py
Original file line number Diff line number Diff line change
Expand Up @@ -1095,14 +1095,14 @@ def export(self, *args, queryset=None, **kwargs):
queryset = self.get_queryset()
queryset = self.filter_export(queryset, *args, **kwargs)
headers = self.get_export_headers()
data = tablib.Dataset(headers=headers)
dataset = tablib.Dataset(headers=headers)

for obj in self.iter_queryset(queryset):
data.append(self.export_resource(obj))
dataset.append(self.export_resource(obj))

self.after_export(queryset, data, *args, **kwargs)
self.after_export(queryset, dataset, *args, **kwargs)

return data
return dataset

def _get_ordered_field_names(self, order_field):
"""
Expand Down
3 changes: 2 additions & 1 deletion pyproject.toml
Original file line number Diff line number Diff line change
Expand Up @@ -33,10 +33,11 @@ classifiers = [
"Topic :: Software Development",
]

# tablib temporarily using master before v4
dependencies = [
"diff-match-patch",
"Django>=3.2",
"tablib[html,ods,xls,xlsx,yaml]==3.5.0",
"tablib[html,ods,xls,xlsx,yaml]@git+https://github.com/jazzband/tablib.git@master#egg=tablib"
]

[project.urls]
Expand Down
4 changes: 2 additions & 2 deletions requirements/base.txt
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
Django>=3.2
# tablib temporarily pinned to 3.5.0 - see issue #1602
tablib[html,ods,xls,xlsx,yaml]==3.5.0
# # tablib temporarily using master before v4
tablib[html,ods,xls,xlsx,yaml]@git+https://github.com/jazzband/tablib.git@master#egg=tablib
diff-match-patch
31 changes: 27 additions & 4 deletions tests/core/tests/test_admin_integration.py
Original file line number Diff line number Diff line change
Expand Up @@ -614,8 +614,7 @@ def test_returns_xlsx_export(self):
)

@override_settings(IMPORT_EXPORT_ESCAPE_FORMULAE_ON_EXPORT=True)
@patch("import_export.mixins.logger")
def test_export_escape_formulae(self, mock_logger):
def test_export_escape_formulae(self):
Book.objects.create(id=1, name="=SUM(1+1)")
Book.objects.create(id=2, name="<script>alert(1)</script>")
response = self.client.get("/admin/core/book/export/")
Expand All @@ -630,8 +629,32 @@ def test_export_escape_formulae(self, mock_logger):
self.assertEqual("<script>alert(1)</script>", wb.active["B2"].value)
self.assertEqual("SUM(1+1)", wb.active["B3"].value)

mock_logger.debug.assert_called_once_with(
"IMPORT_EXPORT_ESCAPE_FORMULAE_ON_EXPORT is enabled"
@override_settings(IMPORT_EXPORT_ESCAPE_FORMULAE_ON_EXPORT=True)
def test_export_escape_formulae_csv(self):
b1 = Book.objects.create(id=1, name="=SUM(1+1)")
response = self.client.get("/admin/core/book/export/")
self.assertEqual(response.status_code, 200)

index = self._get_input_format_index("csv")
data = {"file_format": str(index)}
response = self.client.post("/admin/core/book/export/", data)
self.assertIn(
f"{b1.id},SUM(1+1),,,0,,,,,\r\n".encode(),
response.content,
)

@override_settings(IMPORT_EXPORT_ESCAPE_FORMULAE_ON_EXPORT=False)
def test_export_escape_formulae_csv_false(self):
b1 = Book.objects.create(id=1, name="=SUM(1+1)")
response = self.client.get("/admin/core/book/export/")
self.assertEqual(response.status_code, 200)

index = self._get_input_format_index("csv")
data = {"file_format": str(index)}
response = self.client.post("/admin/core/book/export/", data)
self.assertIn(
f"{b1.id},=SUM(1+1),,,0,,,,,\r\n".encode(),
response.content,
)


Expand Down