Fixes bug 789639: Implement new Socorro filesystem classes.
This implements four classes compatible with the Socorro crash storage API.

* `FSRadixTreeStorage`: This stores crashes under `YYYYMMDD/"name"/radix.../crash_id`. It provides processed crash storage but does not support finding new crashes.
* `FSDatedRadixTreeStorage`: This composes `FSRadixTreeStorage` to support additional referencing to crashes via `YYYYMMDD/"date"/HH/MM_SS/crash_id` that link to the actual directory stored in the `FSRadixTreeStorage`. It also provides a reverse symlink inside the `FSRadixTreeStorage` called `date_root` to link back to the `MM_SS` folder. This does not provide processed crash storage but supports finding new crashes.
* `PrimaryDeferredStorage`: This composes two storages that support the crash storage API and designates incoming crashes based on a `deferral_criteria` parameter to be stored in one of the two storages. This provides processed crash storage and finding new crashes.
* `PrimaryDeferredProcessedStorage`: This composes three storages, doing the same as above but all processed crashes are stored in the same crash storage.
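
The routing that `PrimaryDeferredStorage` performs can be pictured with a small sketch. This is not the code from this commit: `InMemoryStorage`, `PrimaryDeferredSketch`, the `save_raw_crash(raw_crash, dumps, crash_id)` signature and the `throttled` key are all assumptions made for illustration. Only the idea of a `deferral_criteria` callable choosing between two stores comes from the message above.

```python
class InMemoryStorage(object):
    """Toy stand-in for a crash storage (hypothetical, not part of Socorro)."""

    def __init__(self):
        self.raw_crashes = {}

    def save_raw_crash(self, raw_crash, dumps, crash_id):
        self.raw_crashes[crash_id] = (raw_crash, dumps)


class PrimaryDeferredSketch(object):
    """Routes each raw crash to one of two stores (illustrative only)."""

    def __init__(self, primary, deferred, deferral_criteria):
        self.primary = primary
        self.deferred = deferred
        self.deferral_criteria = deferral_criteria

    def save_raw_crash(self, raw_crash, dumps, crash_id):
        # Crashes matching the criteria go to the deferred store;
        # everything else goes to the primary store.
        if self.deferral_criteria(raw_crash):
            target = self.deferred
        else:
            target = self.primary
        target.save_raw_crash(raw_crash, dumps, crash_id)


primary, deferred = InMemoryStorage(), InMemoryStorage()
storage = PrimaryDeferredSketch(
    primary,
    deferred,
    # Hypothetical criteria: defer crashes flagged as throttled.
    deferral_criteria=lambda crash: crash.get('throttled', False),
)
storage.save_raw_crash({'throttled': True}, {}, 'crash-a')
storage.save_raw_crash({'throttled': False}, {}, 'crash-b')
```

After these two calls, `crash-a` sits in the deferred store and `crash-b` in the primary store.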
Tony Young committed Feb 11, 2013 (1 parent f644856, commit b5e4cbc)
Showing 15 changed files with 1,309 additions and 19 deletions.
@@ -0,0 +1,66 @@
New Filesystem
==============

The new filesystem module (``socorro.external.fs``) is a rewrite of
``socorro.external.filesystem`` to use the new, more consistent crash storage
API. It consists of two crash storage classes: ``FSRadixTreeStorage`` and
``FSDatedRadixTreeStorage``.

``FSRadixTreeStorage``
----------------------

.. image:: fs-fsradixtreestorage.png

This storage class employs a radix scheme taking the hex digits in pairs from
the start of crash_id. For example, a crash_id that looks like
``38a4f01e...20090514`` would be stored in a directory structure like this::

    .../20090514/name/38/a4/f0/1e/38a4f01e...20090514

The depth of the directory tree is specified by the seventh character from the
right of the crash_id, i.e. the first 0 in 2009 in the example. By default, if
the value is 0, the nesting is 4.

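The radix scheme can be sketched in a few lines of Python. This is an illustration rather than the module's code: the function name, its parameters and the crash ID are made up, and two points are assumptions not stated in the text, namely that the date directory is ``20`` plus the last six characters of the crash_id and that a non-zero depth character is used directly as the nesting depth.

```python
import os


def radix_path(fs_root, crash_id, name_branch='name', default_depth=4):
    """Return the leaf directory for crash_id under the radix scheme."""
    # Assumption: the date directory is '20' plus the trailing YYMMDD.
    date_dir = '20' + crash_id[-6:]
    # The seventh character from the right encodes the nesting depth;
    # 0 means "use the default" (4).
    depth = int(crash_id[-7]) or default_depth
    # Take the hex digits in pairs from the start of the crash_id.
    radix_parts = [crash_id[i * 2:i * 2 + 2] for i in range(depth)]
    parts = [fs_root, date_dir, name_branch] + radix_parts + [crash_id]
    return os.path.join(*parts)


# Made-up crash ID ending in a depth character of 0 and the date 071025.
example_id = '0bba929f-8721-460c-dead-a43c20071025'
print(radix_path('/crashes', example_id))
```

On a POSIX system the call resolves to ``/crashes/20071025/name/0b/ba/92/9f/`` followed by the crash ID.
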
The leaf directory contains the raw crash information, exported as JSON, and
the various associated dump files -- or, if being used as processed storage,
contains the processed JSON file.

``FSDatedRadixTreeStorage``
---------------------------

.. image:: fs-fsdatedradixtreestorage.png

This storage class extends ``FSRadixTreeStorage`` to include a date branch.
The date branch implements an indexing scheme so that the rough order in
which the crashes arrived is known. The directory structure consists of the
hour, the minute and the interval of seconds the crash was received for
processing -- for instance, if a crash was received at 3:30:12pm, the directory
structure would look something like::

    .../20090514/date/15/30_03/38a4f01e...20090514

(the 03 in 30_03 corresponds to an interval slice of 4: 12 // 4 = 3)

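The slot arithmetic above can be checked with a short sketch. The function and parameter names are hypothetical; only the ``HH/MM_SS`` layout and the slice of 4 seconds come from the text, and the date directory is passed in because it is taken from the crash ID rather than from the arrival time.

```python
import datetime
import os


def dated_path(fs_root, date_dir, crash_id, received,
               date_branch='date', slice_seconds=4):
    """Return the path of the dated entry for a crash."""
    # Seconds are grouped into slices: 12 // 4 = 3 gives slot "03".
    slot = received.second // slice_seconds
    minute_slot = '%02d_%02d' % (received.minute, slot)
    return os.path.join(fs_root, date_dir, date_branch,
                        '%02d' % received.hour, minute_slot, crash_id)


# A crash received at 3:30:12pm, as in the example above.
received = datetime.datetime(2009, 5, 14, 15, 30, 12)
print(dated_path('/crashes', '20090514', 'some-crash-id', received))
```

The hour component is ``15`` and the minute slot is ``30_03``, matching the directory shown above.
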
In the example, the date 20090514 is the date the collector encoded in the
crash's ID, rather than the date the crash was received by the processor.

The crash ID in the dated folder is a symbolic link to the same folder in the
radix tree, e.g. the directory given in the example would be linked to
``.../20090514/name/38/a4/f0/1e/38a4f01e...20090514``. A corresponding link,
named ``date_root``, is created in the radix folder and points back to
``.../20090514/date/15/30_03``. This allows jumps between the two directory
hierarchies -- crash data can be obtained by visiting the dated hierarchy, and
a crash's location in the dated hierarchy can be found by looking the crash up
by its ID.

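The pair of links can be sketched with ``os.symlink``. This is a minimal illustration in a temporary directory, not the module's code: ``link_crash`` and the crash ID are made up, and the sketch uses absolute link targets, which may differ from what the storage class does.

```python
import os
import tempfile


def link_crash(radix_dir, dated_dir, crash_id):
    """Create the forward and reverse links between the two hierarchies."""
    if not os.path.isdir(dated_dir):
        os.makedirs(dated_dir)
    # Forward link: <dated_dir>/<crash_id> points at the radix leaf.
    os.symlink(radix_dir, os.path.join(dated_dir, crash_id))
    # Reverse link: <radix leaf>/date_root points at the MM_SS folder.
    os.symlink(dated_dir, os.path.join(radix_dir, 'date_root'))


root = tempfile.mkdtemp()
crash_id = '0bba929f-8721-460c-dead-a43c20071025'  # made-up ID
radix_dir = os.path.join(root, '20071025', 'name',
                         '0b', 'ba', '92', '9f', crash_id)
dated_dir = os.path.join(root, '20071025', 'date', '15', '30_03')
os.makedirs(radix_dir)
link_crash(radix_dir, dated_dir, crash_id)
```

Resolving ``dated_dir/crash_id`` now leads to the radix leaf, and resolving ``radix_dir/date_root`` leads back to the dated slot.
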
This dated directory structure enables efficient traversal of the folder
hierarchy for new crashes. All the date directories in the root are
traversed to find the symbolic links to the radix directories. When one is
found, it is unlinked from the filesystem and its ID is yielded to the
caller. This proceeds until all the directories have been visited, by which
time all the new crashes should have been yielded.

In order to prevent race conditions, the process computes the current slot
and declines to enter any slot with a number greater than the current slot,
because another process may still be writing to it.
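
The traversal and the slot guard can be sketched as a generator. The names and signature are hypothetical, and the sketch makes two assumptions beyond the text: the guard is applied only to date directories at or after ``today``, and the current slot itself is skipped as well, since it is the slot most likely to be in use.

```python
import os
import tempfile


def new_crashes(fs_root, today, current_slot, date_branch='date'):
    """Yield crash IDs found in the dated hierarchy, unlinking each one."""
    for date in sorted(os.listdir(fs_root)):
        dated_base = os.path.join(fs_root, date, date_branch)
        if not os.path.isdir(dated_base):
            continue
        for hour in sorted(os.listdir(dated_base)):
            hour_base = os.path.join(dated_base, hour)
            for minute_slot in sorted(os.listdir(hour_base)):
                # Zero-padded names compare correctly as strings.
                if date >= today and (hour, minute_slot) >= current_slot:
                    continue  # a writer may still be using this slot
                slot_base = os.path.join(hour_base, minute_slot)
                for crash_id in sorted(os.listdir(slot_base)):
                    link = os.path.join(slot_base, crash_id)
                    if os.path.islink(link):
                        os.unlink(link)
                        yield crash_id


# Build a tiny dated hierarchy with one finished and one current slot.
root = tempfile.mkdtemp()
for slot, crash_id in [('30_02', 'crash-a'), ('30_03', 'crash-b')]:
    slot_base = os.path.join(root, '20090514', 'date', '15', slot)
    os.makedirs(slot_base)
    # The link target does not need to exist for this illustration.
    os.symlink(os.path.join(root, 'radix', crash_id),
               os.path.join(slot_base, crash_id))

found = list(new_crashes(root, today='20090514',
                         current_slot=('15', '30_03')))
```

Here ``found`` contains only ``crash-a``; the link for ``crash-b`` stays in place until its slot is no longer current.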
@@ -0,0 +1,60 @@
# This Source Code Form is subject to the terms of the Mozilla Public
# License, v. 2.0. If a copy of the MPL was not distributed with this
# file, You can obtain one at http://mozilla.org/MPL/2.0/.

from configman import Namespace
from configman.converters import classes_in_namespaces_converter
from configman.dotdict import DotDict

from socorro.cron.base import BaseCronApp

from socorro.lib.datetimeutil import utc_now

import os
import shutil


class RadixCleanupCronApp(BaseCronApp):
    app_name = 'cleanup_radix'
    app_description = 'Cleans up dead radix directories'

    required_config = Namespace()
    required_config.add_option(
        'dated_storage_classes',
        doc='a comma delimited list of storage classes',
        default='',
        from_string_converter=classes_in_namespaces_converter(
            template_for_namespace='storage%d',
            name_of_class_option='crashstorage_class',
            instantiate_classes=False,  # we instantiate manually for thread
                                        # safety
        )
    )

    def __init__(self, config, *args, **kwargs):
        super(RadixCleanupCronApp, self).__init__(config, *args, **kwargs)
        self.storage_namespaces = \
            config.dated_storage_classes.subordinate_namespace_names
        self.stores = DotDict()
        for a_namespace in self.storage_namespaces:
            self.stores[a_namespace] = \
                config[a_namespace].crashstorage_class(config[a_namespace])

    def run(self):
        today = utc_now().strftime("%Y%m%d")

        for storage in self.stores.values():
            for date in os.listdir(storage.config.fs_root):
                if date >= today:
                    continue  # don't process today's crashes or any crashes
                              # from the future

                if os.listdir(os.sep.join([storage.config.fs_root, date,
                                           storage.config.date_branch_base])):
                    self.config.logger.error("Could not delete crashes for "
                                             "date %s: branch isn't empty",
                                             date)
                    continue  # if the date branch isn't empty, then it's not
                              # safe to nuke

                shutil.rmtree(os.sep.join([storage.config.fs_root, date]))