- ``exclude`` - when matched, results are excluded from the index, as if they do not exist. User will receive a 404.
- ``block`` - when matched, results are not excluded from the index, but access to the actual content is blocked. User will see a 451.
- ``allow`` - full access to the index and the resource, but may be overridden by an embargo.
- ``allow_ignore_embargo`` - full access to the index and resource, overriding any embargo settings.
The difference between ``exclude`` and ``block`` is that when blocked, the user can be notified that access is blocked, while
with ``exclude``, no trace of the resource is presented to the user.
The use of ``allow`` is useful to provide access to more specific resources within a broader block/exclude rule, while ``allow_ignore_embargo``
can be used to override any embargo settings.
If both are present, the embargo restrictions are checked first and take precedence, unless the ``allow_ignore_embargo`` option is used
to override the embargo.
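As an illustration only, the precedence described above can be sketched in Python. The names ``STATUS_FOR_ACCESS`` and ``resolve_access`` are hypothetical and are not part of pywb; the ``200`` status for the two ``allow`` settings and the ``"embargo"`` label are assumptions made for the sketch, which only mirrors the documented behavior:

.. code-block:: python

    # Hypothetical sketch, not pywb's implementation.
    # 404 and 451 come from the text above; 200 is an assumed "normal" status.
    STATUS_FOR_ACCESS = {
        "exclude": 404,
        "block": 451,
        "allow": 200,
        "allow_ignore_embargo": 200,
    }

    def resolve_access(rule_access, under_embargo):
        """Return the effective outcome for a capture.

        The embargo is checked first and takes precedence, unless the
        matched rule is ``allow_ignore_embargo``.
        """
        if under_embargo and rule_access != "allow_ignore_embargo":
            return "embargo"  # assumed label for "embargo restrictions apply"
        return rule_access

With this sketch, ``resolve_access("allow", True)`` yields the embargo outcome, while ``resolve_access("allow_ignore_embargo", True)`` keeps full access.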
User-Based Access Controls
^^^^^^^^^^^^^^^^^^^^^^^^^^
The access control rules can be further customized by specifying different permissions for different 'users'. Since pywb does not have a user system,
a special header, ``X-Pywb-ACL-User``, can be used to indicate a specific user.
This setting is designed to allow a more privileged user to access additional content or override an embargo.
For example, the following access control settings restrict access to ``https://example.com/restricted/`` by default, but allow access for the ``staff`` user::
Combined with the embargo settings, this can also be used to override the embargo for internal organizational users, while keeping the embargo for general access::
To make this work, pywb must be running behind an Apache or Nginx system that is configured to set ``X-Pywb-ACL-User: staff`` based on certain settings.
For example, this header may be set based on IP range, or based on password authentication.
Further examples of how to set this header will be provided in the deployments section.
**Note: Do not use the user-based rules without configuring proper authentication on an Apache or Nginx frontend to set or remove this header; otherwise, the X-Pywb-ACL-User header can easily be faked.**
See the :ref:`config-acl-header` section in Usage for examples on how to configure this header.
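The idea of user-specific rules can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not pywb's rule matching: the rules are represented as plain dictionaries for a single URL prefix, the optional ``user`` key and the function name ``access_for_user`` are hypothetical, and ``acl_user`` stands for the value of the ``X-Pywb-ACL-User`` header (empty when the header is absent):

.. code-block:: python

    # Hypothetical sketch: pick the access setting for one URL prefix,
    # preferring a rule that names the current user over the default rule.
    def access_for_user(rules, acl_user=""):
        default = None
        for rule in rules:
            rule_user = rule.get("user")
            if rule_user is None:
                # rule without a user applies to everyone
                default = default or rule["access"]
            elif rule_user == acl_user:
                # a rule for this specific user wins
                return rule["access"]
        return default

    # Assumed example data: restricted by default, allowed for "staff".
    rules = [
        {"access": "allow", "user": "staff"},
        {"access": "block"},
    ]

Here ``access_for_user(rules, "staff")`` selects ``allow``, while a request without the header falls back to ``block``.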
Access Error Messages
^^^^^^^^^^^^^^^^^^^^^
The special error code 451 is used to indicate that a resource has been blocked (access setting ``block``).
The `error.html <https://github.com/webrecorder/pywb/blob/master/pywb/templates/error.html>`_ template contains a special message for this access and can be customized further.
The .aclj files need not ever be added or edited manually.
The pywb ``wb-manager`` utility has been extended to provide tools for adding, removing and checking access control rules.
The access rules are written to ``<collection>/acl/access-rules.aclj`` for a given collection ``<collection>`` for automatic collections.
For example, to add the first line to an ACL file ``access.aclj``, one could run::
wb-manager acl add <collection> com, allow
A specific user for user-based rules can also be specified, for example to add ``allow_ignore_embargo`` for user ``staff`` only, run::
The required ``source_coll`` setting specifies the source collection from which to load content that will be recorded.
Most likely this will be the :ref:`live-web` collection, which should also be defined.
This feature is still experimental but should generally work. Additional options for working with the Redis Dedup index will be added in the future.
.. _put-custom-record:
Adding Custom Resource Records
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pywb now also supports adding custom data to a WARC ``resource`` record. This can be used to add custom resources, such as screenshots, logs, error messages,
etc., that are not normally captured as part of recording, but are still useful to store in WARCs.
To add a custom resource, simply call ``PUT /<coll>/record`` with the data to be added as the request body and the type of the data specified as the content-type. The ``url`` can be specified as a query param.
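The shape of such a request can be sketched with the Python standard library. This is only an illustration: the host and port (``http://localhost:8080``), the collection name ``my-coll``, the ``text/plain`` content type, and the helper name ``build_put_record_request`` are assumptions, and the request is only constructed here, not sent:

.. code-block:: python

    from urllib.parse import urlencode
    from urllib.request import Request

    def build_put_record_request(base_url, coll, record_url, data, content_type):
        """Build (but do not send) a PUT /<coll>/record request."""
        # the record URL is passed as the ``url`` query parameter
        query = urlencode({"url": record_url})
        endpoint = f"{base_url}/{coll}/record?{query}"
        return Request(
            endpoint,
            data=data,  # request body: the data to store
            method="PUT",
            headers={"Content-Type": content_type},
        )

    # Assumed values for illustration only.
    req = build_put_record_request(
        "http://localhost:8080",
        "my-coll",
        "file:///my-custom-resource",
        b"Some Custom Data",
        "text/plain",
    )

The request could then be sent with ``urllib.request.urlopen(req)`` against a running pywb instance.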
For example, adding a custom record ``file:///my-custom-resource`` containing ``Some Custom Data`` can be done using ``curl`` as follows::
pywb supports configuring different language locales and loading different language translations, and dynamically switching languages.
pywb can extract all text from templates, generate CSV files for translation, and convert the translated CSV files back into a binary format used for localization/internationalization.
(pywb uses the `Babel library <http://babel.pocoo.org/en/latest/>`_, which extends the `standard Python i18n system <https://docs.python.org/3/library/gettext.html>`_.)
To ensure all localization related dependencies are installed, first run::

    pip install pywb[i18n]

Locales to use are configured in the ``config.yaml``.
The command-line ``wb-manager`` utility provides a way to manage locales for translation, including generating extracted text, and to update translated text.
Adding a Locale and Extracting Text
===================================
To add a new locale for translation and automatically extract all text that needs to be translated, run::

    wb-manager i18n extract <loc>

The ``<loc>`` can be one or more supported two-letter locales or CLDR language codes. To list available codes, you can run ``pybabel --list-locales``.
Localization data is placed in the ``i18n`` directory, and translatable strings can be found in ``i18n/translations/<locale>/LC_MESSAGES/messages.csv``.
Each CSV file looks as follows, listing each source string and an empty string for the translated version::

    "location","source","target"
    "pywb/templates/banner.html:6","Live on",""
    "pywb/templates/banner.html:8","Calendar icon",""
    "pywb/templates/banner.html:9 pywb/templates/query.html:45","View All Captures",""

This CSV can then be passed to translators to translate the text.
(The extraction parameters are configured to load data from ``pywb/templates/*.html`` in ``babel.ini``)
For example, the following will generate translation strings for ``es`` and ``pt`` locales::

    wb-manager i18n extract es pt

The translatable text can then be found in ``i18n/translations/es/LC_MESSAGES/messages.csv`` and ``i18n/translations/pt/LC_MESSAGES/messages.csv``.
The CSV files should be updated with a translation for each string in the ``target`` column.
The extract command adds any new strings without overwriting existing translations, so after running the update command to compile translated strings (described below), it is safe to run the extract command again.
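Before compiling, it can be handy to check which strings still lack a translation. The following is a small sketch using Python's ``csv`` module; the helper name ``untranslated`` is hypothetical, and the sample translation ``En vivo`` is made up purely for illustration:

.. code-block:: python

    import csv
    import io

    def untranslated(csv_file):
        """Return the source strings whose ``target`` column is still empty."""
        reader = csv.DictReader(csv_file)
        return [row["source"] for row in reader if not row["target"].strip()]

    # In-memory sample in the same three-column layout as messages.csv.
    sample = io.StringIO(
        '"location","source","target"\n'
        '"pywb/templates/banner.html:6","Live on","En vivo"\n'
        '"pywb/templates/banner.html:8","Calendar icon",""\n'
    )
    missing = untranslated(sample)

For a real catalog, the same function can be given a file opened with ``open(path, newline="", encoding="utf-8")``, where ``path`` points at one of the ``messages.csv`` files.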
Updating Locale Catalog
=======================
Once the text has been translated, and the CSV files updated, simply run::

    wb-manager i18n update <loc>

This will parse the CSVs and compile the translated string tables for use with pywb.
Specifying locales in pywb
==========================
To enable the locales in pywb, one or more locales can be added to the ``locales`` key in ``config.yaml``, for example::

    locales:
       - en
       - es

Single Language Default Locale
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
pywb can be configured with a default, single-language locale, by setting the ``default_locale`` property in ``config.yaml``::

    default_locale: es
    locales:
       - es

With this configuration, pywb will automatically use the ``es`` locale for all text strings in pywb pages.
pywb will also set ``<html lang="es">`` so that the browser will recognize the correct locale.
Multi-language Translations
~~~~~~~~~~~~~~~~~~~~~~~~~~~
If more than one locale is specified, pywb will automatically show a language switching UI at the top of collection and search pages, with an option
for each locale listed. To include English as an option, it should also be added as a locale (and no strings translated). For example::

    locales:
       - en
       - es
       - pt

will configure pywb to show a language switch option on all pages.
Localized Collection Paths
==========================
When localization is enabled, pywb supports the locale prefix for accessing each collection with a localized language:
If pywb has a collection ``my-web-archive``, then:
* ``/my-web-archive/`` - loads UI with default language (set via ``default_locale``)
* ``/en/my-web-archive/`` - loads UI with ``en`` locale
* ``/es/my-web-archive/`` - loads UI with ``es`` locale
* ``/pt/my-web-archive/`` - loads UI with ``pt`` locale
The language switch options work by changing the locale prefix for the same page.
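The prefix scheme can be illustrated with a short Python sketch. The helpers ``localized_path`` and ``switch_locale`` are hypothetical and are not part of pywb; they only show how a locale prefix relates to a collection path, assuming no collection shares its name with a locale code:

.. code-block:: python

    def localized_path(coll, locale=None):
        """Return the collection path, optionally with a locale prefix."""
        if locale is None:
            return f"/{coll}/"  # default language
        return f"/{locale}/{coll}/"

    def switch_locale(path, new_locale, locales):
        """Swap (or add) the locale prefix of ``path`` for the same page."""
        stripped = path.lstrip("/")
        first, _, rest = stripped.partition("/")
        if first not in locales:
            # no locale prefix present: keep the whole path
            rest = stripped
        return f"/{new_locale}/{rest}"

For example, switching ``/es/my-web-archive/`` to ``pt`` with the configured locales ``en``, ``es`` and ``pt`` gives ``/pt/my-web-archive/``.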
Listing and Removing Locales
============================
To list the locales that have previously been added, you can also run ``wb-manager i18n list``.
To disable a locale from being used in pywb, simply remove it from the ``locales`` key in ``config.yaml``.
To remove data for a locale permanently, you can run: ``wb-manager i18n remove <loc>``. This will remove the locale directory on disk.
To remove all localization data, you can manually delete the ``i18n`` directory.
UI Templates: Adding Localizable Text
=====================================
Localizable text (text that can be translated) can be marked as such directly in the UI templates:

1. By wrapping the text in ``{% trans %}``/``{% endtrans %}`` tags. For example::

       {% trans %}Collection {{ coll }} Search Page{% endtrans %}

2. As a shorthand, by calling a special ``_()`` function, which can be used in attributes or more dynamically. For example::

       ... title="{{ _('Enter a URL to search for') }}">

These methods can be used in all UI templates and are supported by the Jinja2 templating system.
See :ref:`ui-customizations` for a list of all available UI templates.
Working Docker Compose Examples
-------------------------------
The pywb `Deployment Examples <https://github.com/webrecorder/pywb/blob/main/sample-deploy/>`_ include working examples of deploying pywb with Nginx, Apache and OutbackCDX
in Docker using Docker Compose, widely available container orchestration tools.
See `Installing Docker <https://docs.docker.com/get-docker/>`_ and `Installing Docker Compose <https://docs.docker.com/compose/install/>`_ for instructions on how to install these tools.
pywb provides customizable rewriting based on content-type. The available types are configured
in the :py:mod:`pywb.rewrite.default_rewriter`, which specifies rewriter classes per known type,
and mapping of content-types to rewriters.
JS Rewriting
~~~~~~~~~~~~
The JS rewriter is applied to inline ``<script>`` blocks, inline attribute JS, and any files determined to be JavaScript (based on content type and the ``js_`` modifier).
The default JS rewriter does not rewrite any links. Instead, the JS rewriter performs limited regular expression rewriting on the following:
* ``postMessage`` calls
* certain ``this`` property accessors
* specific ``location =`` assignment
The server-side rewriting is to aid the client-side execution of wrapped code.
For more information, see :py:mod:`pywb.rewrite.regex_rewriters.JSWombatProxyRewriterMixin`
JSONP Rewriting
~~~~~~~~~~~~~~~
To ensure the JSONP callback works as expected, the content is rewritten to ``jQuery123(...)`` -> ``jQuery456(...)``
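The callback substitution can be sketched with a regular expression in Python. This is a simplified illustration, not the code in pywb's JSONP rewriter: the function name ``rewrite_jsonp`` and the pattern are assumptions, and the sketch only handles a response that starts directly with the callback name and reads the name from a ``callback`` query parameter:

.. code-block:: python

    import re
    from urllib.parse import parse_qs, urlsplit

    # Leading callback name of a JSONP response, e.g. ``jQuery123(``.
    JSONP_CALL = re.compile(r"^(\s*)[\w.$]+\(")

    def rewrite_jsonp(content, request_url):
        """Replace the recorded callback name with the one being requested."""
        query = parse_qs(urlsplit(request_url).query)
        callbacks = query.get("callback")
        if not callbacks:
            return content  # nothing to rewrite
        requested = callbacks[0]
        # a function replacement avoids backslash escapes being interpreted
        return JSONP_CALL.sub(
            lambda m: m.group(1) + requested + "(", content, count=1
        )

For instance, content beginning with ``jQuery123(`` requested with ``?callback=jQuery456`` comes back beginning with ``jQuery456(``.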
For more information, see :py:mod:`pywb.rewrite.jsonp_rewriter`
DASH and HLS Rewriting
~~~~~~~~~~~~~~~~~~~~~~
To support recording and replaying adaptive streaming formats (DASH and HLS), pywb can perform special rewriting on the manifests for these formats to remove all but one possible resolution/format. As a result, the non-deterministic format selection is reduced to a single consistent format.
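The general idea can be sketched for an HLS master playlist in Python. This is an illustration only and not pywb's rewriter: the function name ``keep_single_variant`` is hypothetical, keeping the variant with the highest ``BANDWIDTH`` is an assumed selection rule, and the sketch assumes each ``#EXT-X-STREAM-INF`` tag is immediately followed by its URI line:

.. code-block:: python

    import re

    # BANDWIDTH attribute of a variant (does not match AVERAGE-BANDWIDTH).
    BANDWIDTH = re.compile(r"[:,]BANDWIDTH=(\d+)")

    def keep_single_variant(master_playlist):
        """Drop all variant streams except the one with the highest bandwidth."""
        lines = master_playlist.splitlines()
        variants = []  # (bandwidth, index of the #EXT-X-STREAM-INF line)
        for i, line in enumerate(lines):
            if line.startswith("#EXT-X-STREAM-INF:"):
                match = BANDWIDTH.search(line)
                variants.append((int(match.group(1)) if match else 0, i))
        if not variants:
            return master_playlist  # not a master playlist; leave untouched
        _, keep = max(variants)
        drop = set()
        for _, i in variants:
            if i != keep:
                drop.update((i, i + 1))  # the tag and its URI line
        return "\n".join(l for i, l in enumerate(lines) if i not in drop)

Because only one variant remains in the manifest, a player has a single stream to choose during both recording and replay.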
For more information, see :py:mod:`pywb.rewrite.rewrite_hls` and :py:mod:`pywb.rewrite.rewrite_dash` and the tests in ``pywb/rewrite/test/test_content_rewriter.py``
The configuration assumes uwsgi is started with ``uwsgi uwsgi.ini``
.. _config-acl-header:
Configuring Access Control Header
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The :ref:`access-control` system allows users to be granted different access settings based on the value of an ACL header, ``X-Pywb-ACL-User``.
The header can be set via Nginx or Apache to grant custom access privileges based on IP address, password, or another combination of rules.
For example, to set the value of the header to ``staff`` if the request comes from one of the designated local IP ranges (127.0.0.1, 192.168.1.0/24), the following settings can be added to the configs:
For Nginx::

    geo $acl_user {
      # ensure user is set to empty by default
      default           "";

      # optional: add IP ranges to allow privileged access
      127.0.0.1         "staff";
      192.168.1.0/24    "staff";
    }

    ...

    location /wayback/ {
      ...

      uwsgi_param HTTP_X_PYWB_ACL_USER $acl_user;
    }

For Apache::

    <If "-R '192.168.1.0/24' || -R '127.0.0.1'">
      RequestHeader set X-Pywb-ACL-User staff
    </If>
    # ensure header is cleared if no match
    <Else>
      RequestHeader set X-Pywb-ACL-User ""
    </Else>

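To reason about which client addresses would receive the ``staff`` value under such a configuration, the IP matching can be mirrored in a few lines of Python. This is only a checking aid under stated assumptions, not something pywb runs: the names ``STAFF_NETWORKS`` and ``acl_user_for`` are hypothetical, and the ranges are the example ranges named in the text above:

.. code-block:: python

    import ipaddress

    # Example ranges from the text: 127.0.0.1 and 192.168.1.0/24.
    STAFF_NETWORKS = [
        ipaddress.ip_network("127.0.0.1/32"),
        ipaddress.ip_network("192.168.1.0/24"),
    ]

    def acl_user_for(remote_addr):
        """Return the header value a frontend would set for this client IP."""
        addr = ipaddress.ip_address(remote_addr)
        if any(addr in net for net in STAFF_NETWORKS):
            return "staff"
        return ""  # header cleared for everyone else

Any address outside the listed ranges maps to the empty value, matching the requirement that the header is cleared when there is no match.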
Running on Subdirectory Path
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Deployment Examples
^^^^^^^^^^^^^^^^^^^
The ``sample-deploy`` directory includes working Docker Compose examples for deploying pywb with Nginx and Apache on the ``/wayback`` subdirectory.
See:
- `Docker Compose Nginx <https://github.com/webrecorder/pywb/blob/main/sample-deploy/docker-compose-nginx.yaml>`_ for sample Nginx config.
- `Docker Compose Apache <https://github.com/webrecorder/pywb/blob/main/sample-deploy/docker-compose-apache.yaml>`_ for sample Apache config.
- `uwsgi_subdir.ini <https://github.com/webrecorder/pywb/blob/main/sample-deploy/uwsgi_subdir.ini>`_ for example subdirectory uwsgi config.