# Investigating a decline in Yammer users with SQL and Excel

Yammer is a social network for communicating with coworkers by sharing documents and updates and by posting ideas in groups.

When we arrived at work today, our boss asked us to query the last few months of Yammer's weekly user activity. Here's what we found:

![Screenshot%20%285%29.png](attachment:Screenshot%20%285%29.png)

Our query shows a decline in user engagement beginning the week of 28 July. Let's use some SQL to pull relevant data from our database and figure out what the problem is.

But first, we need to define the "weekly active users" metric shown above.

#### Yammer defines engagement as having made some type of server call by interacting with the product. These events include login events, messaging events, search events, events logged as users progress through a signup funnel, and events around received emails.
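To make the metric concrete, here is a minimal sketch that counts weekly active users, assuming a weekly active user is anyone who logged at least one event during a week that starts on Monday. The `events` table, its columns, the event names, and the sample rows are all hypothetical, and the SQL runs against an in-memory SQLite database rather than the real one.

```python
import sqlite3

# Hypothetical table, columns, event names, and rows; the real schema
# appears only in the screenshots.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id INTEGER, occurred_at TEXT, event_name TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        (1, "2014-07-21 09:00:00", "login"),
        (1, "2014-07-22 10:00:00", "send_message"),  # same user, same week: counted once
        (2, "2014-07-23 11:00:00", "search_run"),
        (3, "2014-07-27 23:59:00", "login"),         # Sunday still belongs to the 21 July week
        (1, "2014-07-28 09:30:00", "login"),
    ],
)

# date(x, 'weekday 0', '-6 days') moves to the Sunday on or after x, then back
# six days, which yields the Monday that starts x's week.
WEEKLY_ACTIVE_USERS = """
SELECT date(occurred_at, 'weekday 0', '-6 days') AS week_start,
       COUNT(DISTINCT user_id) AS weekly_active_users
FROM events
GROUP BY week_start
ORDER BY week_start
"""

wau = conn.execute(WEEKLY_ACTIVE_USERS).fetchall()
for week_start, users in wau:
    print(week_start, users)
```

`COUNT(DISTINCT user_id)` is the important part: a user who triggers many events in one week still counts as a single active user.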

# Step 1: Get familiar with our data

Our first table keeps track of user data:

![table%201.png](attachment:table%201.png)

Our other two tables keep track of event information:

![table%202.png](attachment:table%202.png)

![table%203.png](attachment:table%203.png)

# Step 2: Form some hypotheses

Prior to looking at our data, we should brainstorm some ideas about what could be causing the problem. Here are some possibilities for our case:

 - Decrease in new users signing up
 - Decrease in usage by 1 or 2 large companies
 - Decrease in user retention correlated with the frequency of weekly email blasts
 - Decrease in users specific to a geographic location affected by a non-working holiday or a natural disaster

We also have some possibilities that we can't measure, like:

 - Decrease in marketing/marketing budget
 - Competing product/service is stealing our users

# Step 3: Try to disprove our hypotheses

#### Hypothesis #1: Let's see if our sharp drop in users is correlated with new user sign-ups.

![Yammer%20new%20users.png](attachment:Yammer%20new%20users.png)

![new%20users%20excel.png](attachment:new%20users%20excel.png)

After querying each week and formatting the results in Excel, we see that the week of 28 July set a new all-time high for new user sign-ups. This means our decline in weekly users isn't coming from a decrease in new user events.
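The weekly sign-up count and the all-time-high check can be sketched as follows. This is only an illustration under assumptions: the `users` table, its `created_at` column, and the rows are made up, and SQLite stands in for the real database.

```python
import sqlite3

# Hypothetical table, column names, and rows for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_id INTEGER, created_at TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [
        (1, "2014-07-14 08:00:00"),
        (2, "2014-07-21 09:00:00"),
        (3, "2014-07-24 10:00:00"),
        (4, "2014-07-28 11:00:00"),
        (5, "2014-07-30 12:00:00"),
        (6, "2014-08-01 13:00:00"),
    ],
)

# Bucket each sign-up into the week (starting Monday) in which it happened.
NEW_USERS_PER_WEEK = """
SELECT date(created_at, 'weekday 0', '-6 days') AS week_start,
       COUNT(*) AS new_users
FROM users
GROUP BY week_start
ORDER BY week_start
"""

signups = conn.execute(NEW_USERS_PER_WEEK).fetchall()

# The week with the most sign-ups; if it is also the week of the decline,
# a shortage of new users cannot explain the drop.
peak_week, peak_count = max(signups, key=lambda row: row[1])
print(signups)
print("peak:", peak_week, peak_count)
```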

#### Hypothesis #2: Let's see if our decrease in users is due to 1 or 2 companies that make up a significant portion of our total users

![user%20engagement%20sql%20query.png](attachment:user%20engagement%20sql%20query.png)

![user%20engagement.png](attachment:user%20engagement.png)

On the week of our decline, 28 July (highlighted in blue), the company with ID number 2 had a 42% decrease in users. Because company 2 lost more users than any of the next largest companies have in total, it's likely responsible for our sharp drop in users that week.
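A per-company breakdown with the week-over-week percentage change could look like the sketch below. The table names, the `company_id` column, and the rows are assumptions, and the numbers are invented rather than taken from the real data.

```python
import sqlite3

# Hypothetical schema and rows for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER, company_id INTEGER);
CREATE TABLE events (user_id INTEGER, occurred_at TEXT);
""")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(1, 1), (2, 2), (3, 2), (4, 2), (5, 2)],
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [
        (1, "2014-07-21 09:00:00"), (2, "2014-07-21 09:05:00"),
        (3, "2014-07-22 10:00:00"), (4, "2014-07-23 11:00:00"),
        (5, "2014-07-25 12:00:00"),
        (1, "2014-07-28 09:00:00"), (2, "2014-07-29 09:00:00"),
        (2, "2014-07-30 09:00:00"), (3, "2014-07-31 10:00:00"),
    ],
)

# Distinct active users per company per week (weeks start on Monday).
ACTIVE_USERS_BY_COMPANY = """
SELECT u.company_id,
       date(e.occurred_at, 'weekday 0', '-6 days') AS week_start,
       COUNT(DISTINCT e.user_id) AS active_users
FROM events AS e
JOIN users AS u ON u.user_id = e.user_id
GROUP BY u.company_id, week_start
ORDER BY u.company_id, week_start
"""

def week_over_week_change(rows):
    """Percent change in active users versus each company's previous listed week.

    A week with no events produces no row, so "previous" means the previous
    week that appears in the result, not necessarily the calendar week before.
    """
    changes = {}
    previous = {}
    for company_id, week_start, active_users in rows:
        if company_id in previous:
            prev = previous[company_id]
            changes[(company_id, week_start)] = 100.0 * (active_users - prev) / prev
        previous[company_id] = active_users
    return changes

rows = conn.execute(ACTIVE_USERS_BY_COMPANY).fetchall()
changes = week_over_week_change(rows)
print(changes)
```

In this toy data, company 2 goes from 4 active users to 2, a 50% decrease, while company 1 stays flat.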

##### But what could have caused this? 
Let's see if it is due to a geographical reason.

#### Hypothesis #3: Localized geographical event (Holiday/Natural Disaster)

![company%202%20users%20by%20location%20SQL.png](attachment:company%202%20users%20by%20location%20SQL.png)

![company%202%20users%20by%20location%20excel-2.png](attachment:company%202%20users%20by%20location%20excel-2.png)

Excluding the outliers of Germany and Taiwan, company 2 has seen a decrease in users across the globe (North America, Europe, Asia, and South America). This rules out our hypothesis of "a decline in users due to non-working holidays or a natural disaster in an area with a high concentration of users from company 2".

But since we know company 2 has had a sharp decrease in users, let's see what parts of our app they are using:

![company%202%20event_names.png](attachment:company%202%20event_names.png)

![company%202%20event_names%20excel.png](attachment:company%202%20event_names%20excel.png)

On the week of 28 July, all of our events took a hit, especially events relating to the search functionality. In the week following the crash, the non-search events stayed roughly the same and the number of searches increased by 50%, yet our search clicks got crushed for a second consecutive week.

This data brings us to the conclusion that something happened on the week of 28 July that caused a decrease in company 2's employees using our product. We can also infer that company 2's users do not like the results they're getting when they run a search, and that this likely began around 4 August, given that searches increased by 50% that week while clicks per search dropped significantly relative to our other events.
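The clicks-per-search ratio behind this reasoning can be sketched as below. The event names `search_run` and `search_click_result` are assumed for illustration, as are the table and the rows. The sketch also skips the company filter, which would need a join to the user table.

```python
import sqlite3

# Hypothetical table, event names, and rows for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id INTEGER, occurred_at TEXT, event_name TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        (1, "2014-07-21 09:00:00", "search_run"),
        (1, "2014-07-21 09:01:00", "search_click_result"),
        (2, "2014-07-22 10:00:00", "search_run"),
        (2, "2014-07-22 10:01:00", "search_click_result"),
        (2, "2014-07-22 10:02:00", "search_click_result"),
        (1, "2014-08-04 09:00:00", "search_run"),
        (2, "2014-08-05 09:00:00", "search_run"),
        (3, "2014-08-06 09:00:00", "search_run"),
        (3, "2014-08-06 09:01:00", "search_click_result"),
    ],
)

# Count searches and result clicks side by side for each week.
SEARCH_FUNNEL = """
SELECT date(occurred_at, 'weekday 0', '-6 days') AS week_start,
       SUM(CASE WHEN event_name = 'search_run' THEN 1 ELSE 0 END) AS searches,
       SUM(CASE WHEN event_name LIKE 'search_click%' THEN 1 ELSE 0 END) AS clicks
FROM events
GROUP BY week_start
ORDER BY week_start
"""

# Clicks per search; None when a week has no searches to divide by.
clicks_per_search = {
    week_start: (clicks / searches if searches else None)
    for week_start, searches, clicks in conn.execute(SEARCH_FUNNEL)
}
print(clicks_per_search)
```

More searches combined with fewer clicks per search, as in the second week of this toy data, is the pattern that suggests users are not finding what they look for.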

Before we wrap up our report, let's check if our weekly email blasts have anything to do with the decline in users.

#### Hypothesis #4: Decline in users is due to frequency or quality of email blasts

![all%20email%20events%20SQL.png](attachment:all%20email%20events%20SQL.png)

![all%20email%20events%20excel-2.png](attachment:all%20email%20events%20excel-2.png)

After scrubbing the data around our decline, we see that we maintained a 2-3 percent increase in weekly digest emails sent, both before and after the decline.

But what really jumps out is that even though the week our decline began (28 July) had fewer total events, the email click-through rate (CTR) and the number of opened emails went up (top peach-colored cells).

With that information alone, we could suspect a negative correlation between email events and weekly users. However, the following week also had a decline in weekly users while email events decreased as well, so there's likely no negative correlation between the two.

On another note, our click-through rate for company 2 specifically fell sharply on the week of our decline (bottom peach-colored cells). This might have contributed to that week's decline in weekly users. However, because our number of emails opened and CTR began improving again the following week, the issue might be limited to that week's weekly digest email, or it could be caused by the same thing that caused the roughly 40% drop in company 2's users that week.
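The email metrics discussed above can be sketched as a small weekly funnel. Here the open rate is taken as opens divided by digests sent, and CTR as clickthroughs divided by opens; both definitions are assumptions, as are the `emails` table, the action names, and the rows.

```python
import sqlite3

# Hypothetical table, action names, and rows for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emails (user_id INTEGER, occurred_at TEXT, action TEXT)")
rows = (
    [(u, "2014-07-21 08:00:00", "sent_weekly_digest") for u in (1, 2, 3, 4)]
    + [(u, "2014-07-21 12:00:00", "email_open") for u in (1, 2)]
    + [(1, "2014-07-21 12:05:00", "email_clickthrough")]
    + [(u, "2014-07-28 08:00:00", "sent_weekly_digest") for u in (1, 2, 3, 4, 5)]
    + [(u, "2014-07-28 12:00:00", "email_open") for u in (1, 2, 3, 4)]
    + [(1, "2014-07-28 12:05:00", "email_clickthrough")]
)
conn.executemany("INSERT INTO emails VALUES (?, ?, ?)", rows)

# One row per week with the count of each email action.
EMAIL_FUNNEL = """
SELECT date(occurred_at, 'weekday 0', '-6 days') AS week_start,
       SUM(CASE WHEN action = 'sent_weekly_digest' THEN 1 ELSE 0 END) AS sent,
       SUM(CASE WHEN action = 'email_open' THEN 1 ELSE 0 END) AS opens,
       SUM(CASE WHEN action = 'email_clickthrough' THEN 1 ELSE 0 END) AS clicks
FROM emails
GROUP BY week_start
ORDER BY week_start
"""

email_rates = {}
for week_start, sent, opens, clicks in conn.execute(EMAIL_FUNNEL):
    email_rates[week_start] = {
        "open_rate": opens / sent if sent else None,  # share of digests that were opened
        "ctr": clicks / opens if opens else None,     # share of opened emails that were clicked
    }
print(email_rates)
```

Keeping the raw counts next to the rates matters here, because a week can have more opens and still a lower CTR, as the second week of this toy data shows.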

# Step 4: Our Conclusions

1) Considering we hit a new all-time high for new users created on the week of 28 July, and that most of our events come from established user engagement, it's unlikely that the sharp decline is correlated with a drop in new user sign-ups.

2) The decline in weekly active users beginning on 28 July could be due to multiple factors:
 - On the week of 28 July, our second-largest customer, the company with ID 2, saw an overall decrease in events of 42% from the week prior. The percentage decrease was roughly the same across company 2's top 10 contributing countries, so it is unlikely to be related to a geographical event like a localized holiday or natural disaster.
 - Beginning 28 July, among company 2's general events (events that aren't related to signing up), the decrease in events related to Yammer's search functionality was 22% larger than the decrease in events unrelated to search.
 - This pattern is even stronger the following week, when the average decrease in search-related events was 48% larger than the decrease in unrelated ones. This could mean that the search results being provided to company 2 are not useful or relevant.
 
3) Lastly, on average, the click-through rate for opened weekly emails has been decreasing across all companies since 21 July. This might mean our emails are becoming less interesting or relevant to users.


# Step 5: Our Recommendations

1) I recommend we reach out to users at company 2 to get feedback on our app's search functionality and see if they've started using a competitor's product.

2) If we've recently changed code in our search functionality or information in user profiles for company 2, it might be worth reverting those changes until we figure out why they are unpopular among company 2's users.

3) Lastly, along with taking a close look at the unpopular weekly digest email from 28 July, we could implement A/B testing for our weekly digest emails to improve our percentage of opened emails and average click-through rate.

##### This practice problem was taken from Mode.com, and the user data has been edited to protect the privacy of Yammer and its users.