Cleaned up database documentation headers

1 parent b43ec62, commit a06de2cdffb01a548aebb6b3d0821149e92e6ed8, selenamarie committed Oct 23, 2012
@@ -14,30 +14,30 @@ A materialized view, or "matview" is the results of a query stored as a table in
The rest of this guide assumes that all three conditions above are true. For matviews for which one or more conditions are not true, consult the PostgreSQL DBAs for your matview.
Do I Want a Matview?
-====================
+--------------------
Before proceeding to construct a new matview, test the responsiveness of simply running a query over reports_clean and/or reports_user_info. You may find that the query returns fast enough ( < 100ms ) without its own matview. Remember to test the extreme cases: Firefox release version on Windows, or Fennec aurora version.
Also, matviews are really only effective if they are smaller than 1/4 the size of the base data from which they are constructed. Otherwise, it's generally better to simply look at adding new indexes to the base data. Try populating a couple days of the matview, ad-hoc, and checking its size (pg_total_relation_size()) compared to the base table from which it's drawn. The new signature summaries was a good example of this; the matviews to meet the spec would have been 1/3 the size of reports_clean, so we added a couple new indexes to reports_clean instead.
Components of a Matview
-=======================
+-----------------------
In order to create a new matview, you will create or modify five or six things:
-1. a table to hold the matview data
-2. an update function to insert new matview data once per day
-3. a backfill function to backfill one day of the matview
-4. add a line in the general backfill_matviews function
-5. if the matview is to be backfilled from deployment, a script to do this
-6. a test that the matview is being populated correctly.
+# a table to hold the matview data
+# an update function to insert new matview data once per day
+# a backfill function to backfill one day of the matview
+# add a line in the general backfill_matviews function
+# if the matview is to be backfilled from deployment, a script to do this
+# a test that the matview is being populated correctly.
-Point (6) is not yet addressed by a test framework for Socorro, so we're skipping it currently.
+The final point is not yet addressed by a test framework for Socorro, so we're skipping it currently.
For the rest of this doc, please refer to the template matview code sql/templates/general_matview_template.sql in the Socorro source code.
Creating the Matview Table
-==========================
+--------------------------
The matview table should be the basis for the report or screen you want. It's important that it be able to cope with all of the different filter and grouping criteria which users are allowed to supply. On the other hand, most of the time it's not helpful to try to have one matview support several different reports; the matview gets bloated and slow.
@@ -59,7 +59,7 @@ So, as an example, we're going to create a simple matview for summarizing crashe
report_date
report_count
key product_version, domain, report_date
-
+
We actually use the custom procedure create_table_if_not_exists() to create this. This function handles idempotence, permissions, and secondary indexes for us, like so:
::
@@ -224,24 +224,5 @@ file and look like this:
END LOOP;
END;$f$;
-
-This script would then be checked into the set of upgrade scripts
-for that version of the database.
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+This script would then be checked into the set of upgrade scripts for that version of the database.
@@ -14,7 +14,7 @@ All functions below return BOOLEAN, with TRUE meaning completion, and
throw an ERROR if they fail, unless otherwise noted.
MatView Functions
-=================
+-----------------
These functions manage the population of the many Materialized Views
in Socorro. In general, for each matview there are two functions
@@ -175,7 +175,7 @@ Functions marked "last day only" do not accumulate data, but display it only for
day they were run. As such, there is no need to fill them in for each day.
Other Matview Functions
-=======================
+-----------------------
Matview functions which don't fit the parameters above include:
@@ -288,7 +288,7 @@ Called By: other udpate functions
Schema Management Functions
-===========================
+----------------------------
These functions support partitioning, upgrades, and other management
of tables and views.
@@ -466,7 +466,7 @@ Notes: drop_old_partitions assumes a table_YYYYMMDD naming format.
Other Administrative Functions
-==============================
+------------------------------
add_old_release
---------------
@@ -549,12 +549,3 @@ release_throttle
If throttling back the number of release crashes processed, set here
Notes: add_new_product will return FALSE rather than erroring if the product already exists.
-
-
-
-
-
-
-
-
-
@@ -10,7 +10,7 @@ PostgreSQL database which are useful for application development, but
do not fit in the "Admin" or "Datetime" categories.
Formatting Functions
-====================
+--------------------
build_numeric
-------------
@@ -45,7 +45,7 @@ Takes a numeric build_id and returns the date of the build.
API Functions
-=============
+-------------
These functions support the middleware, making it easier to look up
certain things in the database.
@@ -71,7 +71,7 @@ Takes a product name and a list of version_strings, and returns an array (list)
WHERE product_version_id = ANY ( $list );
Mathematical Functions
-======================
+----------------------
These functions do math operations which we need to do repeatedly, saving some typing.
@@ -91,7 +91,7 @@ Returns the "crashes per hundred ADU", by this formula:
( crashes / throttle ) * 100 / adu
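As a rough illustration of the formula above, and not the database function itself, the same arithmetic can be sketched in Python. The function name, the argument types, and the handling of zero ADU are assumptions made for this example; the SQL implementation may behave differently at those edges.

```python
def crashes_per_hundred_adu(crashes: int, throttle: float, adu: int) -> float:
    """Sketch of: ( crashes / throttle ) * 100 / adu.

    `throttle` is taken to be the fraction of crashes actually processed
    (e.g. 0.1 for 10%), so dividing by it scales the count back up.
    """
    if throttle <= 0:
        # Assumption: a non-positive throttle is treated as invalid input.
        raise ValueError("throttle must be greater than zero")
    if adu == 0:
        # Assumption: report 0.0 instead of dividing by zero.
        return 0.0
    return (crashes / throttle) * 100 / adu


if __name__ == "__main__":
    # 50 processed crashes at a 10% throttle, against 100,000 ADU.
    print(crashes_per_hundred_adu(50, 0.1, 100_000))
```

Dividing by the throttle first estimates the unthrottled crash count, which is then expressed per hundred active daily users.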
Internal Functions
-==================
+------------------
These functions are designed to be called by other functions, so are sparsely documented.
@@ -2,13 +2,6 @@
.. _databaseschema-chapter:
-Out-of-Date Data Warning
-========================
-
-While portions of this doc are still relevant and interesting for
-current socorro usage, be aware that it is extremely out of date
-when compared to current schema.
-
Database Schema
===============
@@ -24,10 +17,11 @@ The tables can be divided into three major categories: crash data,
aggregate reporting and process control.
-crash data
-----------
+Core crash data diagram
+=======================
.. image:: core-socorro.png
+ :width: 600px
reports
-------
@@ -126,20 +120,23 @@ Partitioned Child Table
Inherits: extensions
- Materialized View Reporting
- ===========================
+Materialized View Reporting
+===========================
.. image:: matviews-socorro.png
+ :width: 600px
Monitor, Processors and crontabber tables
=========================================
.. image:: helper-socorro.png
+ :width: 600px
Admin tables
============
.. image:: admin-socorro.png
+ :width: 600px
@@ -10,7 +10,7 @@ which are used to manage socorro in a staging and development environment, as we
deploy upgrades. These scripts are detailed below.
Upgrade Scripts
-===============
+---------------
These scripts are used on a weekly basis to upgrade the various socorro PostgreSQL database servers.
@@ -73,7 +73,7 @@ be run by the database superuser and won't run otherwise.
MiniDB Scripts
-==============
+--------------
This directory contains scripts for extracting and loading a smaller copy of the socorro PostgreSQL database ... called a "MiniDB" ... from production data. This MiniDB is used for testing and staging.
@@ -145,4 +145,4 @@ Creates a copy of /pgdata/9.0/data for backup so that it can be restored later f
postsql directory
-----------------
-Contains several SQL scripts which create database objects which error out during load due to broken dependencies, particularly views based on matviews. postsql.sh shell script calls these. Intended to be called by loadMiniDBonDev.py.
+Contains several SQL scripts which create database objects which error out during load due to broken dependencies, particularly views based on matviews. postsql.sh shell script calls these. Intended to be called by loadMiniDBonDev.py.