
Snapshot blogpost

1 parent 82befa7 commit 7cbf09f937dc59fe605aacde8668d3e1300ed70e @jirtob committed Jul 27, 2011
BIN .DS_Store
Binary file not shown.
@@ -0,0 +1,139 @@
+---
+title: Analyzing a Movement of Data with Snapshots
+excerpt: This article explains how to use the snapshot technique in SalesForce Analytics.
+layout: post
+---
+
+# {{ page.title }}
+
+Today we will focus on the snapshot technique. Not sure what that is? Read on, and you will see how snapshots can help you.
+
+**Analyzing SalesForce**
+
+Why use snapshots at all? Imagine that you analyze SalesForce data. To get useful insights from that data, you need a tool that lets you analyze trends.
+
+Sales Analytics is all about the sales opportunities generated by your business activities. Every sales opportunity has a stage attribute that identifies its position in the business process. When an opportunity is created, its stage is set to `Interest`. When the sales process ends, the opportunity is in the `Closed Won` or `Closed Lost` stage.
+
+Now you know the basics of opportunities and their stages. Let’s move on. We want to analyze how opportunity stages change over time, or how many opportunities were in a specific stage at the beginning of a specific time period. In other words, we want to compare the current opportunities with their history.
+
+And here is the problem: when you change the stage of an opportunity, the previous value is overwritten by the new one. That means you need to store the history of the data somehow.
+
+**Snapshot Basics**
+
+In GoodData, this problem is solved by snapshots. A snapshot is a complete copy of the data that is stored in GoodData and taken weekly. Every opportunity is therefore stored multiple times, once per week, with the attribute values it had at that time.
+
+So, every Monday morning there is a new data snapshot, and within it every sales opportunity is marked with a SnapshotId, a unique number that identifies the snapshot, and with a SnapshotDate, the date on which the snapshot was created. These two facts are used in the basic metrics.
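
To make the idea concrete, here is a minimal Python sketch of weekly snapshotting. It is not GoodData code; the `take_snapshot` helper, the record layout, and the sample opportunities are assumptions made up for illustration. It only shows why appending a tagged copy each week preserves the stage history that an in-place update would overwrite.

```python
from datetime import date, timedelta


def take_snapshot(history, opportunities, snapshot_id, snapshot_date):
    """Append a tagged copy of every current opportunity to the history."""
    for opp in opportunities:
        # {**opp, ...} builds a new dict, so later changes to `opp`
        # do not alter rows that are already stored in the history.
        history.append(
            {**opp, "SnapshotId": snapshot_id, "SnapshotDate": snapshot_date}
        )


# Hypothetical current state of two opportunities.
current = [
    {"OpportunityId": "A", "Stage": "Interest"},
    {"OpportunityId": "B", "Stage": "Interest"},
]

history = []
first_monday = date(2011, 1, 3)
take_snapshot(history, current, 1, first_monday)

# One week later opportunity A has moved on; the source system overwrites
# its stage, but the row from snapshot 1 stays untouched in the history.
current[0]["Stage"] = "Closed Won"
take_snapshot(history, current, 2, first_monday + timedelta(weeks=1))

stages_of_a = [
    (row["SnapshotId"], row["Stage"])
    for row in history
    if row["OpportunityId"] == "A"
]
print(stages_of_a)
```

Each opportunity now appears once per snapshot, which is exactly what the snapshot metrics rely on.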
+
+A few system metrics that use SnapshotID and SnapshotDate are loaded with your data. They also serve as building blocks for more advanced metrics. The basic metrics are:
+
+`Snapshot [Most Recent]` - identifies the most recent (the last) data snapshot
+`Snapshot [First of Period]` - identifies the first snapshot of the defined time period
+`Snapshot [Last of Period]` - identifies the last snapshot of the defined time period
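
As a rough illustration of what these three metrics identify, here is a small Python sketch over made-up data. The snapshot IDs and quarters are hypothetical, and treating "most recent" and "last of period" as maximum IDs is our assumption about how the IDs are ordered, not something taken from the GoodData definitions.

```python
from collections import defaultdict

# Hypothetical (SnapshotId, quarter) pairs; IDs are assumed to grow over time.
snapshots = [
    (1, "Q1/2011"),
    (2, "Q1/2011"),
    (13, "Q1/2011"),
    (14, "Q2/2011"),
    (15, "Q2/2011"),
]

# Group the snapshot IDs by the period they fall into.
ids_by_period = defaultdict(list)
for snapshot_id, quarter in snapshots:
    ids_by_period[quarter].append(snapshot_id)

# Most recent snapshot overall, and first/last snapshot within each period.
most_recent = max(snapshot_id for snapshot_id, _ in snapshots)
first_of_period = {q: min(ids) for q, ids in ids_by_period.items()}
last_of_period = {q: max(ids) for q, ids in ids_by_period.items()}

print(most_recent, first_of_period, last_of_period)
```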
+
+Now let’s look at how these metrics are built, using `Snapshot [First of Period]` as the example. We need to identify the SnapshotID of the first snapshot in the time period that we want to analyze.
+
+Here you can see this metric in MAQL:
+
+`SELECT MIN(SnapshotId) BY ALL IN ALL OTHER DIMENSIONS EXCEPT Date (Snapshot)`
+
+Let’s look at this MAQL statement in more detail. Every metric is an aggregation of a fact, and you can specify how that fact is aggregated. We will decompose the metric above piece by piece.
+
+`SELECT MIN(SnapshotID)...`
+
+This defines the aggregation: we are looking for the smallest value of the SnapshotID fact.
+
+`...BY ALL IN ALL OTHER DIMENSIONS EXCEPT Date (Snapshot)`
+
+This clause specifies the aggregation level. Here it locks the value of the metric at the highest aggregation level in all dimensions, with one exception: Date (Snapshot). In other words, the minimum is computed across everything else, but you can still slice and dice the metric by the Date (Snapshot) attribute.
+
+These concepts are easier to understand from the following example, which shows the resulting reports and the differences between them.
+
+We will use `SELECT MIN(SnapshotID)` with various clauses so that you can compare the results and see the differences. We’ll slice the metric by Date (Quarter/Year) and by SalesRep.
+
+NOTE: You can also check [the documentation](http://developer.gooddata.com/docs/maql.html) about basic aggregation techniques.
+
+**BY**
+
+This clause locks the value of the metric at the aggregation level of the attribute that follows the `BY` keyword. Here is the metric and the resulting report:
+
+`SELECT MIN(SnapshotID) BY Date (Snapshot)`
+
+<p>
+<center><img src="{{ site.root }}/images/posts/by-date.png" alt="Resulting Report with BY"></center>
+</p>
+
+As you can see in the picture above, the minimum SnapshotID differs between individual SalesReps. The second SalesRep in the report (Aucoin, Betty) probably started working during Q1/2011, so her minimum SnapshotID is larger than for the others.
+
+That means we need something that aggregates SnapshotID regardless of differences in the SalesRep dimension. We need an aggregation at a higher level.
+
+**BY ALL IN ALL OTHER DIMENSIONS**
+
+This clause locks the value of the metric at the highest possible aggregation level across all dimensions. It returns a grand total that cannot be sliced any further, which means it is a constant. Check out the example in the picture below. The metric is:
+
+`SELECT MIN(SnapshotID) BY ALL IN ALL OTHER DIMENSIONS`
+
+<p>
+<center><img src="{{ site.root }}/images/posts/by-all-in-all.png" alt="Resulting Report with BY ALL IN ALL"></center>
+</p>
+
+As you can see, we again sliced the metric by the Date (Quarter/Year) and SalesRep attributes. As mentioned above, the metric returns the all-time minimum regardless of any dimension, so it is a constant.
+
+We need something that returns the total minimum of the SnapshotID, but only for a specific time period. Let’s move on to another expression.
+
+**BY ALL IN ALL OTHER DIMENSIONS EXCEPT**
+
+Adding `EXCEPT attribute` to the `BY ALL IN ALL OTHER DIMENSIONS` clause specifies an exception: the metric will still be sliced and diced by the specified attribute (and its hierarchy) if that attribute is contained in the report. In our example, the metric is:
+
+`SELECT MIN(SnapshotID) BY ALL IN ALL OTHER DIMENSIONS EXCEPT Date (Snapshot)`
+
+<p>
+<center><img src="{{ site.root }}/images/posts/by-all-in-all-except.png" alt="Resulting Report with BY ALL IN ALL EXCEPT"></center>
+</p>
+
+This is what we were looking for. We have the total minimum for every time period regardless of the other dimensions. Compare this result with the previous `BY` example: the second SalesRep now has the same numbers as the others.
+
+So, we have an aggregation that identifies the minimum SnapshotID for a specific time period, and we are ready to create a useful report with it!
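
The difference between the three variants can also be mimicked in a few lines of Python. This is only a sketch over made-up rows and a hypothetical `min_by` helper. It imitates the behavior described above (a minimum per report cell, one constant, and a minimum per period) and is not how the MAQL engine is implemented.

```python
# Hypothetical rows: (SnapshotId, quarter, sales_rep).
rows = [
    (1, "Q1/2011", "Rep A"),
    (2, "Q1/2011", "Rep A"),
    (2, "Q1/2011", "Rep B"),  # Rep B first appears in snapshot 2
    (14, "Q2/2011", "Rep A"),
    (14, "Q2/2011", "Rep B"),
    (15, "Q2/2011", "Rep B"),
]


def min_by(rows, key):
    """Return the minimum SnapshotId for each distinct key value."""
    result = {}
    for row in rows:
        k = key(row)
        result[k] = min(result.get(k, row[0]), row[0])
    return result


# Like the first report: the minimum depends on quarter AND sales rep.
per_cell = min_by(rows, lambda r: (r[1], r[2]))

# Like BY ALL IN ALL OTHER DIMENSIONS: one constant for the whole report.
grand_min = min(r[0] for r in rows)

# Like ... EXCEPT Date (Snapshot): one minimum per period, same for every rep.
per_quarter = min_by(rows, lambda r: r[1])

print(per_cell)
print(grand_min)
print(per_quarter)
```

In this toy data, Rep B gets 2 as the Q1/2011 minimum in the per-cell variant, while the per-quarter variant gives every rep the same value for that quarter.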
+
+**Using Snapshots in Report**
+
+Now that we know the basics of how snapshots work, let’s move on to an example.
+
+We want to find out how many opportunities were in the `Interest` stage at the beginning of Q1/2011 and how many were in the same stage at the end of this period. To identify exactly the right data, we will need the metrics described above: `Snapshot [First of Period]` and `Snapshot [Last of Period]`. Let’s define the report.
+
+First we need to define the `WHAT` - metrics. In our report we will use two metrics:
+
+`Oppty [Count of All First Snapshot]`
+`Oppty [Count of All Last Snapshot]`
+
+Let’s describe them. The first metric:
+
+`Oppty [Count of All First Snapshot]`
+
+is defined in MAQL as:
+
+`SELECT COUNT(OpportunityId,OpportunitySnapshotId) WHERE SnapshotId=Snapshot [First of Period]`
+
+The second metric is:
+
+`Oppty [Count of All Last Snapshot]`
+
+and in MAQL:
+
+`SELECT COUNT(OpportunityId,OpportunitySnapshotId) WHERE SnapshotId=Snapshot [Last of Period]`
+
+As you can see, the aggregation is simple: we want to compute a count of opportunities. The only difference between the two metrics is the expression after the WHERE keyword, which identifies the snapshot as described earlier.
+
+Now that the metrics are defined, we need to slice and dice them by attributes. We are looking for opportunities in the `Interest` stage, which is Stage = 1. The second attribute is Quarter/Year = Q1/2011, because this is the quarter we are interested in and we want to see the movement of opportunities within it.
+
+So, we define the `HOW` - attributes and apply filters:
+
+Stage = 1
+Quarter/Year = Q1/2011
+
+Now check out the result. It’s simple but powerful, because you can see the movement of sales opportunities. The first number is the count of all opportunities that were in the `Interest` stage at the beginning of Q1/2011, and the second number is the same count at the end of that period. (Movement!)
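
For readers who prefer code to screenshots, the whole report can be imitated with a short Python sketch. The rows, the `count_in_stage` helper, and the stage names used as plain strings are assumptions for illustration. The real report is computed by GoodData from the two MAQL metrics and the filters above.

```python
# Hypothetical snapshot rows: (OpportunityId, SnapshotId, quarter, stage).
rows = [
    ("A", 1, "Q1/2011", "Interest"),
    ("B", 1, "Q1/2011", "Interest"),
    ("C", 1, "Q1/2011", "Closed Won"),
    ("A", 13, "Q1/2011", "Closed Won"),
    ("B", 13, "Q1/2011", "Interest"),
    ("C", 13, "Q1/2011", "Closed Won"),
]


def count_in_stage(rows, quarter, stage, pick):
    """Count distinct opportunities in `stage` in one snapshot of `quarter`.

    `pick` selects the snapshot: `min` for the first snapshot of the
    period, `max` for the last one.
    """
    in_period = [r for r in rows if r[2] == quarter]
    target_snapshot = pick(r[1] for r in in_period)
    return len(
        {r[0] for r in in_period if r[1] == target_snapshot and r[3] == stage}
    )


first_count = count_in_stage(rows, "Q1/2011", "Interest", min)
last_count = count_in_stage(rows, "Q1/2011", "Interest", max)
print(first_count, last_count)
```

With this toy data, two opportunities start the quarter in `Interest` and only one is still there at the end, which is the kind of movement the report makes visible.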
+
+<p>
+<center><img src="{{ site.root }}/images/posts/report-movement.png" alt="Opportunities movement"></center>
+</p>
+
+Of course, this is just a simple example. You can create more interesting and useful reports based on these principles!