add overview to docs describing mfr architecture
 * New file available: overview.rst.  This describes the structure of
   MFR, specifically the three most important types of components:
   Handlers, Providers, and Extensions.

 * Fill in docstrings here and there.
felliott committed Aug 18, 2016
1 parent 141fe5a commit b4242db
Showing 8 changed files with 101 additions and 2 deletions.
4 changes: 3 additions & 1 deletion docs/index.rst
@@ -11,7 +11,8 @@ Release v\ |version|. (:ref:`Installation <install>`)
Ready to dive in?
-----------------

-Go on to the :ref:`Quickstart tutorial <quickstart>` or check out some :ref:`examples <examples>`.
+Go to the :ref:`Quickstart tutorial <quickstart>`, check out some :ref:`examples <examples>` for code, or see the :ref:`Overview <overview>` to understand its architecture.



Guide
@@ -23,6 +24,7 @@ Guide
install
quickstart
examples
overview
code

Project info
63 changes: 63 additions & 0 deletions docs/overview.rst
@@ -0,0 +1,63 @@
.. _overview:

Overview
========

Modular File Renderer (MFR) is a Python web application that provides a single interface for displaying many different types of files in a browser. If an iframe's ``src`` attribute is an MFR render url, MFR will return the html needed to display the image, document, object, etc.

There are three main categories of modules in MFR: :ref:`Handlers <handlers>`, :ref:`Providers <providers>`, and :ref:`Extensions <extensions>`. Handlers are the user-facing endpoints of MFR, accepting HTTP requests and returning either HTML or the file. Providers are responsible for knowing how to fetch the metadata and content for a file, given a URL to it. Extensions convert files and construct HTML to make specific file types renderable in a browser.


.. _handlers:

Handlers
--------

In MFR, **handlers** are the classes that handle the web requests made to MFR. The two most important handlers are the Render handler, which handles requests to the ``/render`` endpoint, and the Export handler, which handles requests to the ``/export`` endpoint. There are also endpoints for handling static assets, but those will not be described here. See `mfr.server.app` and `mfr.server.core.ExtensionsStaticFileHandler` for those.

Base handler
^^^^^^^^^^^^

The **base handler** extracts the ``url`` query parameter from the request, constructs an appropriate MFR Provider object, then asks the Provider to fetch the file metadata.

Render handler
^^^^^^^^^^^^^^

The **Render handler** will construct an appropriate renderer using the Extension module that is mapped to the file's extension. Some renderers require that the file contents be inserted inline (e.g., code snippets needing syntax highlighting). Those will download the file via the Provider. Others will only need a url to the file, which the Extension renderer will be responsible for inserting. The output from the renderer will be cached if caching is enabled.
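
The split between renderers that need the file body and those that only need a url can be sketched roughly as follows. This is an illustration only, not MFR's code: the two renderer classes, the ``RENDERERS`` mapping, and ``render_file`` are hypothetical names, and only the ``file_required`` flag corresponds to something in ``mfr/core/extension.py``.

```python
import html


class ImageRenderer:
    """Hypothetical renderer that only needs a url to the file."""
    file_required = False

    def __init__(self, url, body=None):
        self.url = url
        self.body = body

    def render(self):
        return '<img src="{}">'.format(html.escape(self.url, quote=True))


class CodeRenderer:
    """Hypothetical renderer that inlines the (escaped) file contents."""
    file_required = True

    def __init__(self, url, body=None):
        self.url = url
        self.body = body

    def render(self):
        return '<pre>{}</pre>'.format(html.escape(self.body))


# Hypothetical extension-to-renderer mapping, not MFR's actual registry.
RENDERERS = {'.png': ImageRenderer, '.py': CodeRenderer}


def render_file(ext, url, download):
    """Pick a renderer by extension; download only if it asks for the body."""
    renderer_cls = RENDERERS[ext]
    body = download(url) if renderer_cls.file_required else None
    return renderer_cls(url, body).render()
```

Here ``download`` stands in for the Provider's download step. It is passed in as a callable so the sketch shows that image-style renderers never trigger a download.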


Export handler
^^^^^^^^^^^^^^

The **Export handler** takes the ``url`` to the file and a ``format`` query parameter, and constructs an Extension exporter to convert the file into the requested format. For example, most browsers can't render ``.docx`` files directly, so the Extension exporter will convert it to a PDF. The Export handler can also cache results if caching is enabled.


.. _providers:

Providers
---------

The **Provider** is responsible for knowing how to take a url to a file and get both the content and metadata for that file.

Base provider
^^^^^^^^^^^^^

The base provider does little beyond verifying that the url is hosted at a supported domain.
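
A domain check of this kind might look like the sketch below. It assumes the allow-list holds ``scheme://host`` strings; the actual format of ``mfr.server.settings.ALLOWED_PROVIDER_DOMAINS`` is not shown in this commit, and the values and the ``is_allowed_url`` helper are hypothetical.

```python
from urllib.parse import urlparse

# Hypothetical values; the real list lives in
# mfr.server.settings.ALLOWED_PROVIDER_DOMAINS and may use another format.
ALLOWED_PROVIDER_DOMAINS = ('https://osf.io', 'http://localhost:7777')


def is_allowed_url(url, allowed=ALLOWED_PROVIDER_DOMAINS):
    """Return True if the url's scheme and host match an allowed origin."""
    parsed = urlparse(url)
    origin = '{}://{}'.format(parsed.scheme, parsed.netloc)
    return origin in allowed
```

Comparing the parsed origin, rather than testing whether the domain string occurs somewhere in the url, avoids accepting urls that merely mention an allowed host in their path.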

HTTP provider
^^^^^^^^^^^^^

A naive provider that infers file metadata (extension, type, etc.) from the url and downloads the file by issuing a GET request against the url.
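
As a rough sketch of what inferring metadata from a url can mean, the snippet below derives a name, extension, and content type from the url path using only the standard library. The ``infer_metadata`` helper and its return shape are assumptions for illustration; the real provider returns a ``provider.ProviderMetadata`` object.

```python
import mimetypes
import posixpath
from urllib.parse import urlparse


def infer_metadata(url):
    """Guess name, extension, and content type from the url path alone."""
    path = urlparse(url).path  # drops the query string and fragment
    name, ext = posixpath.splitext(posixpath.basename(path))
    content_type, _ = mimetypes.guess_type(path)
    return {'name': name, 'ext': ext, 'content_type': content_type}
```

For example, ``infer_metadata('https://example.com/files/report.pdf?version=2')`` yields the name ``report`` and the extension ``.pdf``. Nothing is fetched, which is why the approach is described as naive: a misleading url produces misleading metadata.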

OSF provider
^^^^^^^^^^^^

`Open Science Framework <https://osf.io/>`_ -aware provider that can convert the given url into a WaterButler url. WaterButler is a file action abstraction service that can be used to fetch metadata and download file contents. The OSF provider also knows how to pass through OSF credentials, to enforce read-access restrictions.
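
The two url checks that ``mfr/providers/osf/provider.py`` relies on can be isolated as small helpers. The function names here are hypothetical; the string tests themselves (``/v1/resources`` as a path prefix, and swapping ``/file?`` for ``/data?`` on v0 urls) are taken from the provider code in this commit.

```python
from urllib.parse import urlparse


def is_waterbutler_v1(url):
    """True if the url already points at a WaterButler v1 resource."""
    return urlparse(url).path.startswith('/v1/resources')


def v0_metadata_url(download_url):
    """Turn a WaterButler v0 download url into its metadata url."""
    # Only the first occurrence is replaced, as in the provider code.
    return download_url.replace('/file?', '/data?', 1)
```

A url that fails the v1 check is treated as an OSF endpoint that redirects to WaterButler, which is the case ``_fetch_download_url`` handles with an extra GET request.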


.. _extensions:

Extensions
----------

**Extensions** are the modules that generate the HTML needed to render a given file type. They may also provide exporters if the file's native type is unrenderable and needs to be converted to another format suitable for browsers. Extension renderers inherit from `mfr.core.extension.BaseRenderer` and exporters inherit from `mfr.core.extension.BaseExporter`.
4 changes: 4 additions & 0 deletions mfr/core/extension.py
@@ -33,6 +33,10 @@ def render(self):

@abc.abstractproperty
def file_required(self):
"""Does the rendering html need the raw file content to display correctly?
Syntax-highlighted text files do. Standard image formats do not, since an <img> tag
only needs a url to the file.
"""
pass

@abc.abstractproperty
4 changes: 4 additions & 0 deletions mfr/core/provider.py
@@ -8,6 +8,10 @@


class BaseProvider(metaclass=abc.ABCMeta):
"""Base class for MFR Providers. Requires ``download`` and ``metadata`` methods.
Validates that the given file url is hosted at a domain listed in
`mfr.server.settings.ALLOWED_PROVIDER_DOMAINS`.
"""

def __init__(self, request, url):
self.request = request
3 changes: 3 additions & 0 deletions mfr/providers/http/provider.py
@@ -15,6 +15,9 @@


class HttpProvider(provider.BaseProvider):
"""Basic MFR provider. Infers file metadata (extension, type) from the url. Downloads by
issuing a GET to the url.
"""

async def metadata(self):
path = urlparse(self.url).path
16 changes: 16 additions & 0 deletions mfr/providers/osf/provider.py
@@ -18,6 +18,10 @@


class OsfProvider(provider.BaseProvider):
"""Open Science Framework (https://osf.io) -aware provider. Knows the OSF ecosystem and
can request specific metadata for the file referenced by the URL. Can correctly propagate
OSF authorization to verify ownership and permissions of the file.
"""

UNNEEDED_URL_PARAMS = ('_', 'token', 'action', 'mode', 'displayName')

@@ -37,13 +41,18 @@ def __init__(self, request, url):
self.view_only = self.view_only[0].decode()

async def metadata(self):
"""Fetch metadata about the file from WaterButler. V0 and V1 urls must be handled
differently.
"""
download_url = await self._fetch_download_url()
if '/file?' in download_url:
# URL is for WaterButler v0 API
# TODO Remove this when API v0 is officially deprecated
metadata_url = download_url.replace('/file?', '/data?', 1)
metadata_request = await self._make_request('GET', metadata_url)
metadata = await metadata_request.json()
else:
# URL is for WaterButler v1 API
metadata_request = await self._make_request('HEAD', download_url)
# To make changes to current code as minimal as possible
metadata = {'data': json.loads(metadata_request.headers['x-waterbutler-metadata'])['attributes']}
@@ -68,6 +77,7 @@ async def metadata(self):
return provider.ProviderMetadata(name, ext, content_type, unique_key, download_url)

async def download(self):
"""Download file from WaterButler, returning stream."""
download_url = await self._fetch_download_url()
headers = {settings.MFR_IDENTIFYING_HEADER: '1'}
response = await self._make_request('GET', download_url, allow_redirects=False, headers=headers)
@@ -87,6 +97,11 @@ async def download(self):
return streams.ResponseStreamReader(response, unsizable=True)

async def _fetch_download_url(self):
"""Provider needs a WaterButler URL to download and get metadata. If ``url`` is already
a WaterButler url, return that. If not, then the url points to an OSF endpoint that will
redirect to WB. Issue a GET request against it, then return the WB url stored in the
Location header.
"""
# v1 Waterbutler url provided
path = urlparse(self.url).path
if path.startswith('/v1/resources'):
@@ -109,6 +124,7 @@ async def _fetch_download_url(self):
return self.download_url

async def _make_request(self, method, url, *args, **kwargs):
"""Pass through OSF credentials."""
if self.cookies:
kwargs['cookies'] = self.cookies
if self.cookie:
7 changes: 7 additions & 0 deletions mfr/server/handlers/core.py
@@ -72,8 +72,15 @@ def options(self):


class BaseHandler(CorsMixin, tornado.web.RequestHandler, SentryMixin):
"""Base class for the Render and Export handlers. Fetches the file metadata for the file
indicated by the ``url`` query parameter and builds the provider caches. Also handles
writing output and errors.
"""

async def prepare(self):
"""Builds an MFR provider instance, to which it passes the ``url`` query parameter.
From that, the file metadata is extracted. Also builds cached waterbutler providers.
"""
if self.request.method == 'OPTIONS':
return

2 changes: 1 addition & 1 deletion mfr/server/handlers/render.py
@@ -27,7 +27,7 @@ async def prepare(self):
self.source_file_path = await self.local_cache_provider.validate_path('/render/' + str(uuid.uuid4()))

async def get(self):
-"""Render a file with the extension"""
+"""Return HTML that will display the given file."""
renderer = utils.make_renderer(
self.metadata.ext,
self.metadata,
