Hooks: add new hooks system
Adds time write time aggregate functions and hooks.

Closes #35
JohnAD authored and ashcrow committed May 30, 2018
1 parent 87ec361 commit 92d696d
Showing 14 changed files with 2,007 additions and 30 deletions.
2 changes: 2 additions & 0 deletions .travis.yml
@@ -14,3 +14,5 @@ script:
  - nosetests --with-coverage --cover-package=flask_track_usage --cover-min-percentage=80 -v test/*.py
notifications:
  email: false
addons:
  postgresql: "9.5"
87 changes: 87 additions & 0 deletions docs/hooks.rst
@@ -0,0 +1,87 @@
Flask-Track-Usage Hooks
=======================

The library supports post-storage functions that can optionally do more after the request itself is stored.

How To Use
----------

To use hooks, pass the list of functions you wish to call in the ``hooks`` argument of a storage instance that is given to TrackUsage.

For example, to add sumRemotes and sumLanguages to a MongoEngineStorage storage:

.. code-block:: python

    from flask_track_usage import TrackUsage
    from flask_track_usage.storage.mongo import MongoEngineStorage
    from flask_track_usage.summarization import sumRemotes, sumLanguages

    t = TrackUsage(app, [MongoEngineStorage(hooks=[sumRemotes, sumLanguages])])

Standard Summary Hooks
----------------------

Time Periods for ALL Summary Hooks
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When keeping live metrics for each of the summaries, the following time periods are used:

:Hour:
one common unit of "storage" is kept to keep track of the hourly traffic for each hour of a particular date.

:Date:
one common unit of "storage" is kept to keep track of the daily traffic for a particular date.

:Month:
one common unit of "storage" is kept to keep track of the monthly traffic for a particular
month. The month stored is the first day of the month. For example, the summary for March
2017 would be stored under the date of 2017-03-01.
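
The three period keys described above can be derived by truncating a request timestamp. The following is a minimal sketch under that assumption, not the library's implementation; the helper name ``period_keys`` is hypothetical.

```python
import datetime


def period_keys(when):
    """Return the hour, date, and month buckets for a timestamp.

    Hypothetical helper for illustration; not part of Flask-Track-Usage.
    """
    # Hour bucket: drop minutes, seconds, and microseconds.
    hour = when.replace(minute=0, second=0, microsecond=0)
    # Date bucket: midnight of the same day.
    date = hour.replace(hour=0)
    # Month bucket: the first day of the month, as described above.
    month = date.replace(day=1)
    return {"hour": hour, "date": date, "month": month}


keys = period_keys(datetime.datetime(2017, 3, 14, 15, 9, 26))
print(keys["month"].date())  # 2017-03-01
```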

Please note that this library DOES NOT handle expiration of old data. If you wish to delete, say, hourly data that is over 60 days old, you will need to create a separate process to handle this. This library merely adds or updates new data and presumes limitless storage.
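
A separate cleanup process could look roughly like the sketch below. It assumes the hourly summaries are available as a plain dictionary keyed by hour-bucket datetimes, which is an assumption made for illustration; the function name ``prune_hourly`` is hypothetical, and a real process would issue the equivalent delete against its own storage backend.

```python
import datetime


def prune_hourly(hourly, now, max_age_days=60):
    """Return a copy of `hourly` without buckets older than the cutoff.

    `hourly` maps hour-bucket datetimes to summary values.
    Hypothetical helper; not part of Flask-Track-Usage.
    """
    cutoff = now - datetime.timedelta(days=max_age_days)
    return {hour: value for hour, value in hourly.items() if hour >= cutoff}


now = datetime.datetime(2017, 6, 1)
hourly = {
    datetime.datetime(2017, 3, 1, 10): {"hits": 4},   # older than 60 days
    datetime.datetime(2017, 5, 30, 10): {"hits": 7},  # recent, kept
}
kept = prune_hourly(hourly, now)
```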

Summary Targets for ALL Summary Hooks
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Currently, the following two data targets are summarized for each of the Time Periods.

:Hits:
The total number of requests seen.
:Transfer:
The total number of bytes transferred in response to all requests seen.
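
As a rough sketch of how these two targets accumulate, the snippet below keeps running totals per period key. The function name ``add_request`` and the string key are hypothetical; the only detail taken from the library is that each logged request carries a ``content_length`` value, which may be ``None``.

```python
from collections import defaultdict


def add_request(totals, key, content_length):
    """Add one request to the running totals stored under `key`."""
    entry = totals[key]
    entry["hits"] += 1
    # content_length can be None when the response size is unknown.
    entry["transfer"] += content_length or 0
    return entry


totals = defaultdict(lambda: {"hits": 0, "transfer": 0})
add_request(totals, "2017-03-01", 512)
add_request(totals, "2017-03-01", None)
add_request(totals, "2017-03-01", 1024)
print(totals["2017-03-01"])  # {'hits': 3, 'transfer': 1536}
```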

sumUrls -- URLs
~~~~~~~~~~~~~~~

Traffic is summarized for each URL requested of the Flask server.

sumRemotes -- remote IPs
~~~~~~~~~~~~~~~~~~~~~~~~

Traffic is summarized for each remote IP address seen by the Flask server.

sumUserAgents -- user agent clients
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Traffic is summarized for each client (aka web browser) seen by the Flask server.

sumLanguages -- languages
~~~~~~~~~~~~~~~~~~~~~~~~~

Traffic is summarized for each language seen in the requests sent to the Flask server.

sumServer -- site-wide server hits/traffic
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Traffic is summarized for all requests sent to the Flask server. This metric is mostly useful for diagnosing performance.

sumVisitors -- unique visitors (as tracked by cookies)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Traffic is summarized for each unique visitor of the Flask server. For this to function, the optional TRACK_USAGE_COOKIE setting must be enabled in config.

This metric is limited by the cookie technology. User behavior such as switching browsers or turning on "anonymous mode" on a browser will make them appear to be multiple users.

sumGeo -- physical country of remote IPs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Traffic is summarized for the tracked geographies of remote IPs seen by the Flask server. For this to function properly, the optional TRACK_USAGE_USE_FREEGEOIP config must be enabled. While the geography function provides a great deal of information, only the country is used for this summarization.
75 changes: 61 additions & 14 deletions docs/index.rst
@@ -1,17 +1,23 @@
Flask-Track-Usage |release|
===========================

Basic metrics tracking for your `Flask`_ application. This focuses more on ip addresses/locations rather than tracking specific users pathing through an application. No extra cookies or javascript is used for usage tracking.
Basic metrics tracking for your `Flask`_ application. The core of the library is very light and focuses on storing basic metrics such as remote IP address and user agent. No extra cookies or JavaScript are used for usage tracking.

* Simple. It's a Flask extension.
* Supports either include or exempt for views.
* Provides lite abstraction for data retrieval.
* Optional `freegeoip.net <http://freegeoip.net/>`_ integration including custom freegeoip installs.
* Multiple storage options available.
* Multiple storage options can be used.
* Pluggable functionality for storage instances.
* Supports Python 2.7 and 3+.

The following is optional:

* `freegeoip.net <http://freegeoip.net/>`_ integration for storing geography of the visitor.
* Unique visitor tracking if you want to use Flask's cookie storage.
* Summation hooks for live count of common web analysis statistics such as hit counts.


.. _Flask: http://flask.pocoo.org/


@@ -59,11 +65,10 @@ Usage
    from flask_track_usage.storage.output import OutputWriter

    # Make an instance of the extension and put two writers
    t = TrackUsage(app, [PrintWriter(), OutputWriter(
        transform=lambda s: "OUTPUT: " + str(s))])

    # Make an instance of the extension
    t = TrackUsage(app, storage)
    t = TrackUsage(app, [
        PrintWriter(),
        OutputWriter(transform=lambda s: "OUTPUT: " + str(s))
    ])

    # Include the view in the metrics
    @t.include
@@ -87,7 +92,7 @@ Include
    app.config['TRACK_USAGE_INCLUDE_OR_EXCLUDE_VIEWS'] = 'include'
    # Make an instance of the extension
    t = TrackUsage(app, PrintStorage())
    t = TrackUsage(app, [PrintWriter()])
    from my_blueprints import a_bluprint
@@ -103,7 +108,7 @@ Exclude
    app.config['TRACK_USAGE_INCLUDE_OR_EXCLUDE_VIEWS'] = 'exclude'
    # Make an instance of the extension
    t = TrackUsage(app, PrintStorage())
    t = TrackUsage(app, [PrintWriter()])
    from my_blueprints import a_bluprint
@@ -120,6 +125,8 @@ TRACK_USAGE_USE_FREEGEOIP

**Default**: False

Turn FreeGeoIP integration on or off. If set to true, then geography information is also stored in the usage logs.

TRACK_USAGE_FREEGEOIP_ENDPOINT
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
**Values**: URL for RESTful JSON query
@@ -136,8 +143,7 @@ would resolve (with an IP of 1.2.3.4) to:

If using SQLStorage, the returned JSON is converted to a string. You will likely want to pass a field list in the URL to avoid exceeding the 128 character limit of the field.


Turn FreeGeoIP integration on or off
Set the URL prefix used to map the remote IP address of each request to a geography. The service must return a JSON response.

TRACK_USAGE_INCLUDE_OR_EXCLUDE_VIEWS
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -150,9 +156,17 @@ If views should be included or excluded by default.
* When set to *exclude* each routed view must be explicitly included via decorator or blueprint include method. If a routed view is not included it will not be tracked.
* When set to *include* each routed view must be explicitly excluded via decorator or blueprint exclude method. If a routed view is not excluded it will be tracked.

TRACK_USAGE_COOKIE
~~~~~~~~~~~~~~~~~~
**Values**: True, False

**Default**: False

Turn unique visitor tracking via cookie on or off. If True, then the unique visitor ID (a quasi-random number) is also stored in the usage logs.

Storage
-------
The following are built in, ready to use storage backends.
The following are built-in, ready-to-use storage backends.

.. note:: Inputs for set_up should be passed in __init__ when creating a storage instance

@@ -208,8 +222,8 @@ sql.SQLStorage
:members:
:inherited-members:

Retrieving Data
---------------
Retrieving Log Data
-------------------
All storage backends, other than printer.PrintStorage, provide get_usage.

.. autoclass:: flask_track_usage.storage.Storage
@@ -246,3 +260,36 @@ Results that are returned from all instances of get_usage should **always** look
.. versionchanged:: 1.1.0
xforwardfor item added directly after remote_addr

Hooks
-----
The basic function of the library simply logs one unit of information per request received. This keeps it simple and light.

However, you can also add post-storage "hooks" that are called after the individual log is stored. In theory, anything could be triggered after the storage.

.. code-block:: python

    # ...
    def helloWorld(**kwargs):
        print("hello world!")

    # Make an instance of the extension
    t = TrackUsage(app, [PrintWriter(hooks=[helloWorld])])

In this example, the helloWorld function would be called once each time PrintWriter's output is invoked. The keyword parameters are those described in the `Retrieving Log Data`_ section above. Some Storages/Writers also add more keys.
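
The ``_BaseWritable`` code in this commit instantiates any hook passed as a class, calls ``set_up(**kwargs)`` on every hook, and then calls each hook with the stored data as keyword arguments. A hook written as a class could therefore look like the sketch below. ``CountingHook`` and ``FakeWriter`` are hypothetical names; ``FakeWriter`` is only a stand-in that mirrors the dispatch loop so the example runs without Flask or a real storage.

```python
import inspect


class CountingHook(object):
    """Hypothetical hook that counts requests per URL path."""

    def __init__(self, **kwargs):
        self.hits = {}
        self.parent = None

    def set_up(self, **kwargs):
        # Called once by the storage; kwargs include `_parent_class_name`.
        self.parent = kwargs.get("_parent_class_name")

    def __call__(self, **data):
        # Called after each request is stored, with the log entry as kwargs.
        path = data.get("path")
        self.hits[path] = self.hits.get(path, 0) + 1


class FakeWriter(object):
    """Stand-in that mirrors the hook dispatch shown in this commit."""

    def __init__(self, **kwargs):
        kwargs["_parent_class_name"] = self.__class__.__name__
        # Instantiate hooks given as classes; keep instances as they are.
        self._post_storage_hooks = [
            hook(**kwargs) if inspect.isclass(hook) else hook
            for hook in kwargs.get("hooks", [])
        ]
        for hook in self._post_storage_hooks:
            hook.set_up(**kwargs)

    def __call__(self, data):
        data["_parent_class_name"] = self.__class__.__name__
        for hook in self._post_storage_hooks:
            hook(**data)
        return data


writer = FakeWriter(hooks=[CountingHook])
writer({"path": "/"})
writer({"path": "/"})
writer({"path": "/about"})
counter = writer._post_storage_hooks[0]
print(counter.hits)  # {'/': 2, '/about': 1}
```

Keeping the counts on the hook instance is a choice made for this sketch only; a summarization hook would more likely write to the parent storage.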

This library has a list of standardized hooks that are used for log summarizing. They are documented in detail here:

:doc:`hooks`
    Standard Summarization Hooks

Not all Stores support all of these hooks. See the details for more information. Usage is fairly straightforward:

.. code-block:: python

    from flask_track_usage import TrackUsage
    from flask_track_usage.storage.mongo import MongoEngineStorage
    from flask_track_usage.summarization import sumBasic

    t = TrackUsage(app, [MongoEngineStorage(hooks=[sumBasic])])

19 changes: 16 additions & 3 deletions src/flask_track_usage/__init__.py
@@ -52,20 +52,27 @@ class TrackUsage(object):
    Tracks basic usage of Flask applications.
    """

    def __init__(self, app=None, storage=None):
    def __init__(self, app=None, storage=None, _fake_time=None):
        """
        Create the instance.
        :Parameters:
           - `app`: Optional app to use.
           - `storage`: If app is set you must pass the storage callables now.
           - `storage`: If app is set, required list of storage callables.
        """
        #
        # `_fake_time` is to force the time stamp of the request for testing
        # purposes. It is not normally used by end users. Must be a native
        # datetime object.
        #
        self._exclude_views = set()
        self._include_views = set()

        if callable(storage):
            storage = [storage]

        self._fake_time = _fake_time

        if app is not None and storage is not None:
            self.init_app(app, storage)

@@ -141,9 +148,15 @@ def after_request(self, response):
        speed = float("%s.%s" % (
            speed_result.seconds, speed_result.microseconds))

        if self._fake_time:
            current_time = self._fake_time
        else:
            current_time = now

        data = {
            'url': ctx.request.url,
            'user_agent': ctx.request.user_agent,
            'server_name': ctx.app.name,
            'blueprint': ctx.request.blueprint,
            'view_args': ctx.request.view_args,
            'status': response.status_code,
@@ -154,7 +167,7 @@ def after_request(self, response):
            'ip_info': None,
            'path': ctx.request.path,
            'speed': float(speed),
            'date': int(time.mktime(now.timetuple())),
            'date': int(time.mktime(current_time.timetuple())),
            'content_length': response.content_length,
            'request': "{} {} {}".format(
                ctx.request.method,
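
The timestamp selection in the hunk above can be tried in isolation with the sketch below. The function name ``request_timestamp`` is hypothetical; only the fallback from ``_fake_time`` to ``now`` and the ``time.mktime`` conversion are taken from the diff. Note that ``time.mktime`` interprets the time tuple as local time.

```python
import datetime
import time


def request_timestamp(now, fake_time=None):
    """Return the integer 'date' value, preferring `fake_time` if given."""
    current_time = fake_time if fake_time else now
    return int(time.mktime(current_time.timetuple()))


fixed = datetime.datetime(2018, 1, 15, 12, 0, 0)
# With a fake time, the real clock value passed as `now` is ignored.
stamp = request_timestamp(datetime.datetime.now(), fake_time=fixed)
```

Passing a fixed datetime in this way is what makes summaries reproducible in tests.
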
57 changes: 56 additions & 1 deletion src/flask_track_usage/storage/__init__.py
@@ -32,6 +32,8 @@
Simple storage callables package.
"""

import inspect


class _BaseWritable(object):
    """
@@ -47,7 +49,27 @@ def __init__(self, *args, **kwargs):
           - `args`: All non-keyword arguments.
           - `kwargs`: All keyword arguments.
        """
        # if "hooks" in kwargs:
        # self._temp_hooks = kwargs["hooks"]
        # del kwargs["hooks"]
        # else:
        # self._temp_hooks = []
        #
        self.set_up(*args, **kwargs)
        #
        # instantiate each hook if not already instantiated
        kwargs["_parent_class_name"] = self.__class__.__name__
        kwargs['_parent_self'] = self
        self._post_storage_hooks = []
        for hook in kwargs.get("hooks", []):
            if inspect.isclass(hook):
                self._post_storage_hooks.append(hook(**kwargs))
            else:
                self._post_storage_hooks.append(hook)
        # call setup for each hook
        for hook in self._post_storage_hooks:
            hook.set_up(**kwargs)
        self._temp_hooks = None

    def set_up(self, *args, **kwargs):
        """
@@ -65,17 +87,50 @@ def store(self, data):
        :Parameters:
           - `data`: Data to store.
        :Returns:
           A dictionary representing, at minimum, the original 'data'. But
           can also include information that will be of use to any hooks
           associated with that storage class.
        """
        raise NotImplementedError('store must be implemented.')

    def get_sum(
        self,
        hook,
        start_date=None,
        end_date=None,
        limit=500,
        page=1,
        target=None
    ):
        """
        Queries a subtending hook for summarization data. Can be overridden.
        :Parameters:
           - 'hook': the hook 'class' or its name as a string
           - `start_date`: datetime.datetime representation of starting date
           - `end_date`: datetime.datetime representation of ending date
           - `limit`: The max amount of results to return
           - `page`: Result page number limited by `limit` number in a page
           - 'target': search string to limit results; meaning depends on hook
        .. versionchanged:: 2.0.0
        """
        pass

    def __call__(self, data):
        """
        Maps function call to store.
        :Parameters:
           - `data`: Data to store.
        """
        return self.store(data)
        self.store(data)
        data["_parent_class_name"] = self.__class__.__name__
        data['_parent_self'] = self
        for hook in self._post_storage_hooks:
            hook(**data)
        return data


class Writer(_BaseWritable):
