This repository has been archived by the owner on Jan 31, 2019. It is now read-only.

Fixed a hiccup with the new Firefox mobile version being mistaken for actual Firefox in the start query, leading to the behavior described in bug 603215.

Added some docs on the sites database that might be helpful in approaching such problems in the future, should they occur again.
x1B committed Oct 11, 2010
1 parent 8ed3640 commit 2c003f5
Showing 2 changed files with 47 additions and 7 deletions.
37 changes: 37 additions & 0 deletions apps/website_issues/README.md
@@ -0,0 +1,37 @@
The design of this database may benefit from an explanation, so that it does
not appear arbitrary and is easier and more satisfying to work with:

The method is "dimensional modeling" (not traditional ER modeling, I *do* know
about normalization :), and is often used for analytics databases ("build
once, query often").

Basically, a row in the sitesummary table is not a website in any way. It
stands for "all clusters that share the same coordinates in the search space".

The search space has 4 "dimensions" in this case:
- version: e.g. "<day>", "<week>", "4.0b3"
- positive: True, False, None (= "Both")
- url: "http://google.com", "https://google.com"
- platform: "vista", "mac", "maemo", "android" or "<mobile>" (= maemo+android)

The other fields in the table (the so-called "facts") are just precomputed
aggregate values over the matching comments / clusters. Now, of course a
cluster can be at multiple coordinates in the search space (yesterday's
comments are latest beta as well as last 7 days, and happy comments are always
also "Both"), so it is aggregated into multiple sitesummary records. For this
reason, 'version', 'platform' and 'positive' always need to be specified in
queries. The url does not need this, because distinct urls are disjoint: a
comment is never counted under more than one url.
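
The overlap described above can be sketched in a few lines of Python. This is
only an illustration, not the code in generate_sites.py: the `coordinates`
helper, its parameters and the sample comments are made up, and the "<mobile>"
roll-up follows the dimension list in this README.

```python
from collections import Counter
from itertools import product

URL = "http://google.com"

def coordinates(version, is_latest_beta, age_days, positive, platform, url):
    """Yield every (version, positive, platform, url) coordinate that a
    single comment is aggregated under. Hypothetical helper."""
    versions = []
    if is_latest_beta:
        versions.append(version)
    if age_days < 7:
        versions.append("<week>")
    if age_days < 1:
        versions.append("<day>")
    positives = [positive, None]  # None stands for "Both"
    platforms = [platform]
    if platform in ("maemo", "android"):
        platforms.append("<mobile>")
    for v, p, o in product(versions, positives, platforms):
        yield (v, p, o, url)

# Two made-up comments on the same site.
comments = [
    ("4.0b3", True, 0.5, True, "android", URL),
    ("4.0b3", True, 3, False, "vista", URL),
]

# One counter entry per sitesummary row; the count is the "fact".
summary = Counter()
for comment in comments:
    for coord in coordinates(*comment):
        summary[coord] += 1

# Fully specified coordinates give a meaningful count.
week_both_vista = summary[("<week>", None, "vista", URL)]

# Summing rows without fixing version/positive/platform counts the same
# comment many times over.
naive_total = sum(summary.values())
```

Here two comments end up in 16 rows, which is why a query that leaves a
dimension unspecified returns inflated numbers.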

In the future, a "locale" dimension could be added (where the "all"
coordinate-component maps 1:1 to the current site summaries), which would
also allow clustering and filtering by locale.

So the reliable way to link to a search result is to use the actual search
parameters for *all* dimensions (e.g. with four get parameters, maybe with
defaults). This will also yield the most useful (if not the same) results if
the link is shared and reused in the future, and it will make for readable
urls.
The SiteSummary.id is intended to link clusters to their search coordinates
when making SQL queries, not to reference these search coordinates from the
web. Every time the clusters are regenerated (every night), the site summary
ids expire.
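
A minimal sketch of such a link builder, assuming hypothetical parameter
names, defaults and a made-up "/sites" path (none of these are taken from the
actual views):

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical defaults; the text above only says "maybe with defaults".
DEFAULTS = {"version": "<week>", "positive": "both", "platform": "<mobile>"}

def search_link(base_path, url, **coords):
    """Build a link from search coordinates, never from SiteSummary.id."""
    params = dict(DEFAULTS, **coords)
    params["url"] = url
    # Sort so that the same coordinates always give the same link.
    return base_path + "?" + urlencode(sorted(params.items()))

link = search_link("/sites", "http://google.com", version="4.0b3")

# The coordinates can be read back from the link at any later time.
query = parse_qs(urlsplit(link).query)
```

Because the link carries the coordinates themselves, it keeps working after
the nightly regeneration has assigned new ids.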

17 changes: 10 additions & 7 deletions apps/website_issues/management/commands/generate_sites.py
@@ -117,10 +117,6 @@ def handle(self, *args, **options):
                        % self.site_summary_id.next())

     def collect_groups(self, err):
-        now = datetime.now()
-        seven_days_ago = now - timedelta(days=7)
-        one_day_ago = now - timedelta(days=1)
-        latest_versions = (LATEST_BETAS[FIREFOX], LATEST_BETAS[MOBILE])
         err("Collecting groups...\n")
         def add(opinion, **kwargs):
             """Add this opinion to it's summary group."""
@@ -139,17 +135,24 @@ def add_variants(opinion, **keypart):
                 add(opinion, os=app, positive=None, **keypart)


+        now = datetime.now()
+        seven_days_ago = now - timedelta(days=7)
+        one_day_ago = now - timedelta(days=1)
         queryset = Opinion.objects.filter(
             ~Q(url__exact="") & (
                 Q(created__range=(seven_days_ago, now))
-                | Q(version__in=latest_versions)
+                | (Q(product__exact=FIREFOX.id) &
+                   Q(version__exact=LATEST_BETAS[FIREFOX]))
+                | (Q(product__exact=MOBILE.id) &
+                   Q(version__exact=LATEST_BETAS[MOBILE]))
             )
-        ).only("url", "version", "created", "positive", "os")
+        ).only("url", "version", "created",
+               "positive", "os", "product")

         i = 0
         for i, opinion in enumerate(queryset):
             site_url = normalize_url(opinion.url)
-            if opinion.version in latest_versions:
+            if opinion.version == LATEST_BETAS[APP_IDS[opinion.product]]:
                 add_variants(opinion, version=opinion.version, url=site_url)
             if opinion.created > seven_days_ago:
                 add_variants(opinion, version="<week>", url=site_url)
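
The difference between the old and the new version check in the hunk above can
be sketched without Django. This is a simplified illustration: the version
strings are invented to provoke the collision, and plain string keys stand in
for the FIREFOX / MOBILE app objects and the APP_IDS lookup.

```python
# Invented values: the latest Mobile beta reuses a version string that is
# already outdated for desktop Firefox.
LATEST_BETAS = {"firefox": "4.0b6", "mobile": "4.0b1"}

def is_latest_old(product, version):
    """Old check: matches the latest beta of *any* product."""
    return version in LATEST_BETAS.values()

def is_latest_new(product, version):
    """New check: matches only the latest beta of the opinion's own product."""
    return version == LATEST_BETAS[product]

# A desktop opinion on the outdated 4.0b1 passes the old check only because
# Mobile's latest beta carries the same version string.
old_result = is_latest_old("firefox", "4.0b1")
new_result = is_latest_new("firefox", "4.0b1")
```

Under these assumptions the old check treats the stale desktop opinion as
"latest beta", while the per-product check does not.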
