## Problem statement: Help startup founders analyse their Play Store app submissions to find ways to improve their probability of success

## Introduction to the problem

For many founders, their app is a core part of their offering. 

However, I have seen with the founders I've worked with that the App Store/Play Store submission is often an afterthought at the end of a development sprint. 

Founders may find themselves second-guessing:
* Which category should I be in? 
* Which of these three titles should I choose?
* What should I include in my description?

These are some of the questions we aim to demystify in this research exercise. 

<img src="https://instabug.com/blog/wp-content/uploads/2017/07/appinfo.png" alt="Play Store submission" style="width: 600px;"/>

### What is a successful app?

* An app that has impact? 
* An app that lots of people download?
* An app that makes a lot of money?
* An app that some people love?

This is not clear-cut, but to simplify the exercise and make it quantifiable, we will focus on the number of installations and the average user rating.

This should provide us with a push-pull set of metrics: a hybrid approximation of reach and retention. 

The exact approach to this could take many forms but we will start with: 

* TO DO

### Apart from the quality of the application/importance of our problem, what factors contribute to success? I.e. how can we optimise our app submission to maximise chance of success?

[This article from Apple](https://developer.apple.com/app-store/product-page/) provides some useful guidance on a successful App Store submission. I've highlighted some of it below.

#### App name

"Your app’s name plays a critical role in how users discover it on the App Store. Choose a simple, memorable name that is easy to spell and hints at what your app does. Be distinctive. Avoid names that use generic terms or are too similar to existing app names. An app name can be up to 30 characters long."


#### Description 

"Provide an engaging description that highlights the features and functionality of your app. The ideal description is a concise, informative paragraph followed by a short list of main features. Let potential users know what makes your app unique and why they will love it. Communicate in the tone of your brand, and use terminology your target audience will appreciate and understand. The first sentence of your description is the most important — this is what users can read without having to tap to read more. Every word counts, so focus on your app’s unique features.

If you choose to mention an accolade, we recommend putting it at the end of your description or as part of your promotional text. Don’t add unnecessary keywords to your description in an attempt to improve search results. Also avoid including specific prices in your app description. Pricing is already shown on the product page, and references within the description may not be accurate in all countries and regions."

#### Keywords
"Keywords help determine where your app displays in search results, so choose them carefully to ensure your app is easily discoverable. Choose keywords based on words you think your audience will use to find an app like yours. Be specific when describing your app’s features and functionality to help the search algorithm surface your app in relevant searches. Consider the trade-off between ranking well for less common terms versus ranking lower for popular terms. Popular, functional terms such as “jobs”, “text”, or “social” may drive a lot of traffic, but are highly competitive in the rankings. Less common terms drive lower traffic, but are less competitive." 

While we may not have keywords data directly, we may indirectly look for apps that avoid generic keywords in their descriptions.
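
One way to approximate this without keyword data is to measure how much of each description is made up of generic terms. The sketch below is illustrative only: `GENERIC_TERMS` and `generic_term_share` are hypothetical names, and the term list is seeded with the three examples from Apple's guidance rather than derived from the dataset.

```python
import re

# Hypothetical list of generic terms, seeded with the examples Apple gives.
GENERIC_TERMS = {"jobs", "text", "social"}


def generic_term_share(description):
    """Return the fraction of words in a description that are generic terms."""
    words = re.findall(r"[a-z']+", description.lower())
    if not words:
        return 0.0
    generic = sum(1 for word in words if word in GENERIC_TERMS)
    return generic / len(words)


# Made-up descriptions, not rows from the dataset.
print(generic_term_share("Social jobs board with text alerts"))
print(generic_term_share("Track tides and swell for your local surf break"))
```

Assuming every row has a description, the same function could be applied to the `description` column to produce a candidate feature; a lower share would suggest more specific wording.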


#### Category selection

"Be sure to select the primary category that is most relevant. Choosing categories that are not relevant to your app may cause your app to be rejected when submitted for review."

## What data do we have?

### App Store dataset 
This dataset contains details of more than 7,000 Apple iOS mobile applications from July 2017.

In [88]:
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
import seaborn as sns

In [89]:
app_metrics = pd.read_csv('../data/AppleStore.csv')
app_description_data = pd.read_csv('../data/appleStore_description.csv')
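# Add the description to the same dataframe.
# Note: this aligns on the row index, so it assumes both files list the apps in the same order.
app_metrics['description'] = app_description_data['app_desc']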
pd.set_option('display.max_columns', None)
app_metrics.head(3)

Unnamed: 0.1,Unnamed: 0,id,track_name,size_bytes,currency,price,rating_count_tot,rating_count_ver,user_rating,user_rating_ver,ver,cont_rating,prime_genre,sup_devices.num,ipadSc_urls.num,lang.num,vpp_lic,description
0,1,281656475,PAC-MAN Premium,100788224,USD,3.99,21292,26,4.0,4.5,6.3.5,4+,Games,38,5,10,1,"SAVE 20%, now only $3.99 for a limited time!\n\nOne of the most popular video games in arcade history!\n2015 World Video Game Hall of Fame Inductee\n\nWho can forget the countless hours and quarters spent outrunning pesky ghosts and chompin’ on dots? Now you can have the same arcade excitement on your mobile devices! \nGuide PAC-MAN through the mazes with easy swipe controls, a MFi controller, or kick it old school with the onscreen joystick!\nEat all of the dots to advance to the next stage. Go for high scores and higher levels! Gain an extra life at 10.000 points! Gobble Power Pellets to weaken ghosts temporarily and eat them up before they change back. Avoid Blinky, the leader of the ghosts, and his fellow ghosts Pinky, Inky, and Clyde, or you will lose a life. It’s game over when you lose all your lives.\n\n9 NEW MAZES Included!!!\nThe game includes 9 new mazes in addition to the pixel for pixel recreation of the classic original maze. Challenge your skill to beat them all! We are constantly updating the game with new maze packs that you can buy to complete your PAC-MAN collection.\n\nHINTS and TIPS!!!\nInsider pro-tips and hints are being made available for the first time in-game! Use these to help you become a PAC-MAN champion!\n\nFEATURES:\n• New tournaments\n• New Visual Hints and Pro-tips\n• New mazes for all new challenges\n• Play an arcade perfect port of classic PAC-MAN\n• Two different control modes\n• Three game difficulties (including the original 1980 arcade game)\n• Retina display support\n• MFi controller support"
1,2,281796108,Evernote - stay organized,158578688,USD,0.0,161065,26,4.0,3.5,8.2.2,4+,Productivity,37,5,23,1,"Let Evernote change the way you organize your personal and professional projects. Dive in: take notes, create to-do lists, and save things you find online into Evernote. We’ll sync everything between your phone, tablet, and computer automatically.\n\n---\n\n“Use Evernote as the place you put everything… Don’t ask yourself which device it’s on—it’s in Evernote” – The New York Times\n\n“When it comes to taking all manner of notes and getting work done, Evernote is an indispensable tool.” – PC Mag\n\n---\n\nGET ORGANIZED\nEvernote gives you the tools you need to keep your work effortlessly organized:\n• Write, collect and capture ideas as searchable notes, notebooks, checklists and to-do lists\n• Take notes in a variety of formats, including: text, sketches, photos, audio, video, PDFs, web clippings and more\n• Use the camera to effortlessly scan, digitize, and organize your paper documents, business cards, handwritten notes and drawings\n• Use Evernote as a digital notepad and easy-to-format word processor for all your thoughts as they come\n\nSYNC ANYWHERE\nEvernote gives you the ability to sync your content across devices:\n• Sync everything automatically across any computer, phone or tablet\n• Start your task working on one device and continue on another without ever missing a beat\n• Add a passcode lock to the mobile app for more privacy\n\nSHARE YOUR IDEAS\nEvernote gives you the tools to share, discuss and collaborate productively with others:\n• Create, share and discuss with the people who help get your work done, all in one app\n• Search within pictures and annotate images to give quick feedback\n• Develop your projects faster and let multiple participants work on different aspects\n\nEVERNOTE IN EVERYDAY LIFE\n• Make personal checklists to keep your thoughts organized\n• Set reminders to keep on top of activities and write to-do lists\n• Gather, 
capture and store every thought you need to stay productive\n• Plan events such as holidays, weddings or parties\n\nEVERNOTE IN BUSINESS\n• Create agendas, write memos and craft presentations\n• Annotate documents with comments and thoughts during team meetings, then share with colleagues\n• Get your projects underway faster and maximise productivity by letting multiple participants access and work on different aspects alongside each other\n\nEVERNOTE IN EDUCATION\n• Keep up with lecture notes so you don’t miss a vital thought\n• Clip and highlight articles from the web for academic research\n• Plan and collaborate for better academic group work\n\nBETTER NOTE INTERACTION WITH 3D TOUCH\n• Quick Actions for faster note creation and search\n• Sketch in notes with pressure sensitive ink\n\nEVERNOTE FOR APPLE WATCH\n• Dictate notes and they will be transcribed in Evernote\n• Dictate searches and get results on your Apple Watch\n• View newly created & updated notes\n• Set reminders, get notifications, and never forget anything\n\n---\n\nAlso available from Evernote:\n\nEVERNOTE PLUS - More space. More devices. More freedom.\n• 1 GB of new uploads each month\n• Unlimited number of devices\n• Access your notes and notebooks offline\n• Save emails to Evernote\n$3.99 monthly, $34.99 annually\n\nEVERNOTE PREMIUM - The ultimate workspace.\n• 10 GB of new uploads each month\n• Unlimited number of devices\n• Access your notes and notebooks offline\n• Save emails to Evernote\n• Search inside Office docs and attachments\n• Annotate PDFs\n• Scan and digitize business cards\n• Show notes as presentations, instantly (desktop only)\n$7.99 monthly, $69.99 annually\n\n---\n\nPrice may vary by location. Subscriptions will be charged to your credit card through your iTunes account. Your subscription will automatically renew unless canceled at least 24 hours before the end of the current period. You will not be able to cancel the subscription once activated. 
Manage your subscriptions in Account Settings after purchase.\n\n---\n\nPrivacy Policy: https://evernote.com/legal/privacy.php \nTerms of Service: https://evernote.com/legal/tos.php"
2,3,281940292,"WeatherBug - Local Weather, Radar, Maps, Alerts",100524032,USD,0.0,188583,2822,3.5,4.5,5.0.0,4+,Weather,37,5,3,1,"Download the most popular free weather app powered by the largest professional weather network in the world! Our weather network delivers the fastest alerts and the best real-time forecasts (current, hourly and 10-day).\n\n“I love WeatherBug! It’s always accurate & is the first place I go for up-to-the minute weather!” –iOS User\n\nIt's easy to use and has 18 different weather maps including Doppler radar, lightning, wind, temperature, alerts, pressure, and humidity. Join millions who rely on WeatherBug every day!\n\n\nTRACK ANY CONDITION\n• North American Doppler Radar: See Doppler radar across the United States, Canada, Mexico, Alaska & Hawaii\n• PulseRad® Radar: Get radar for many international locations using patented Earth Networks Total Lightning Network® technology.\n• Real-Time Pinpoint Forecasts: Get the most accurate current, hourly and 10-day weather forecasts\n• Enhanced Interactive Map: Visualize weather conditions with 18 weather maps\n• Spark Lightning Alerts: Your personal lightning detector, Spark gives you minute-by-minute, mile-by-mile lightning proximity alerts\n• Real-Time Traffic Conditions: View current traffic conditions to better plan your day\n• Apple Watch Support: Get vital weather information directly on your Apple Watch, including alerts, glances, and complications\n• Lifestyle Forecasts: Know how weather will impact your sports games, workouts, allergies, chronic pain and much more\n• Hurricane Center: Stay informed of all hurricane forecasts and changing conditions to protect you and your loved ones\n\n\nLARGEST WEATHER NETWORK\n• Forecasts for 2.6 million+ locations worldwide\n• Largest total lightning detection network\n• 10,000+ professional-grade weather stations\n• Live weather & traffic cameras\n\n\nFASTEST WEATHER ALERTS\n• Get notified of severe weather 50% faster with our Dangerous Thunderstorm 
Alerts\n• Receive all National Weather Service watches and warnings\n\n\nBe prepared. Know Before™. Download the app used and loved by millions, voted the “Best Weather App for iPads” by AppPicker, and “Best App for Moms” by Parent Magazine - WeatherBug!"


## Introducing a success metric


We used a modified version of this approach: [Algorithm to calculate rating based on multiple reviews (using both review score and quantity)](https://math.stackexchange.com/questions/942738/algorithm-to-calculate-rating-based-on-multiple-reviews-using-both-review-score)

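$$\text{score} = P\,p + 10\,(1 - P)\left(1 - e^{-q/Q}\right)$$

Here $p$ is the average rating rescaled to 0-10 (twice the 0-5 star `user_rating`), $q$ is the number of ratings (`rating_count_tot`), $P$ is the weight given to the rating, and $Q$ controls how quickly the rating-count term saturates.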

"The choice of 𝑄 depends on what you call "few", "moderate", "many". As a rule of thumb consider a value 𝑀 that you consider "moderate" and take 𝑄=−𝑀/ln(1/2)≈1.44𝑀. So if you think 100 is a moderate value then take 𝑄=144."

We find that 300 is the median number of reviews so we say this is a moderate value. 

For P, using trial and error, we find 0.8 to be sensible. 

This leaves us with a success score ranging from 0-10 with a median score of 7.6.

In [90]:
# Q sets how many ratings count as "moderate": Q = 1.44 * M, with M = 300 (the median)
Q = 300*1.44
# Earlier variant of the score, kept for reference:
# app_metrics['success_score'] = (app_metrics['user_rating']*2 + (5 * (1 - np.exp(-(app_metrics['rating_count_tot'] / Q))) ))/1.5
# P weights the rating term against the rating-count term
P = 0.8
app_metrics['success_score'] = (2 * P * app_metrics['user_rating']) + ((10 *(1 - P)) * (1 - np.exp(-(app_metrics['rating_count_tot'] / Q))))
app_metrics.head(50)[['rating_count_tot','user_rating','success_score']]
app_metrics['success_score'].describe()

count    7197.000000
mean        6.678456
std         2.961661
min         0.000000
25%         5.926518
50%         7.600000
75%         8.922349
max        10.000000
Name: success_score, dtype: float64
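
As a sanity check on these constants, the sketch below recomputes the score for a few hand-picked inputs using only the standard library. The function name `success_score` and the example inputs are illustrative assumptions; the constants mirror the cell above. With Q = 1.44 × 300, an app with 300 ratings should earn roughly half of the 2 points available for rating volume.

```python
import math

P = 0.8        # weight on the rating term, as chosen above
M = 300        # "moderate" number of ratings (the dataset median)
Q = 1.44 * M   # approximation of M / ln(2)


def success_score(user_rating, rating_count):
    """Score on a 0-10 scale from a 0-5 star rating and a rating count."""
    quality = 2 * P * user_rating                              # at most 10 * P
    volume = 10 * (1 - P) * (1 - math.exp(-rating_count / Q))  # at most 10 * (1 - P)
    return quality + volume


# Illustrative inputs, not rows from the dataset.
for rating, count in [(0.0, 0), (4.0, 300), (5.0, 1_000_000)]:
    print(rating, count, round(success_score(rating, count), 2))
```

The three cases cover the floor of the scale, a 4-star app with the median number of ratings, and a 5-star app with so many ratings that the volume term is saturated.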

## Can we manually predict success score based on features?

Let's define success as a success_score of 6 or more. 

I will select a random sample of 20 app submissions and make a prediction on the first 10 that are in English. 

Using only the title, description and category, I will predict which class each app falls into:
* Success: a success_score of 6 or more
* Failure: a success_score of less than 6

In [94]:
# Uncomment the below to see the samples I used
# sample_of_app_metrics = app_metrics.sample(n=20, random_state=42)
# sample_of_app_metrics[['track_name', 'description','prime_genre', 'success_score']]

### My manual predictions score: 50% accurate classification of whether an app is successful or not using title, description and category. 

Essentially I was no better at predicting success than the toss of a coin.

My logic didn't seem very sound either:

| App id | App name | Prediction | Actual | Reasoning |
| :----: | :------: | :--------: | :----: | :-------: |
| 3592 | A Noble Circle | Failure | Success | Short description, seems low effort |
| 2178 | QR Code Reader by Scan | Success | Success | scan.me url seems premium - invested in the app |
| 5944 | Cricket Captain 2016 | Success | Failure | Name dropped cricket captain, seemed like someone spent some work on this app |
| 2112 | DEVICE 6 | Success | Success | Winner of Apple design award - seems like a good app |
| 6260 | Athlete Shave Salon Games | Failure | Success | Seemed like not much effort gone into description - quite short |
| 3570 | Weaphones Antiques: Firearms Simulator | Success | Success | "From the creators of" made me think they know what they're doing |
| 7000 | Our dark lord-Sasuyu 2-TAP RPG | Failure | Success | Mixing Japanese and English - thought it would be low effort game |
| 4094 | Witches' Legacy: The Dark Throne HD (Full) | Failure | Failure | Started by saying "no in app purchases" - seemed defensive |
| 3463 | My New Baby Story - Makeup Spa & Dressup Games | Failure | Success | Simply listing the levels in description - low effort |
| 2375 | GoodReader - PDF Reader, Annotator and File Manager | Success | Success | 4th edition and trademarked app |
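
The 50% figure can be reproduced directly from the table above. The sketch below hard-codes the ten prediction/actual pairs in row order; the variable names are illustrative.

```python
# (prediction, actual) pairs copied from the table above, in row order.
results = [
    ("Failure", "Success"),  # A Noble Circle
    ("Success", "Success"),  # QR Code Reader by Scan
    ("Success", "Failure"),  # Cricket Captain 2016
    ("Success", "Success"),  # DEVICE 6
    ("Failure", "Success"),  # Athlete Shave Salon Games
    ("Success", "Success"),  # Weaphones Antiques: Firearms Simulator
    ("Failure", "Success"),  # Our dark lord-Sasuyu 2-TAP RPG
    ("Failure", "Failure"),  # Witches' Legacy: The Dark Throne HD (Full)
    ("Failure", "Success"),  # My New Baby Story - Makeup Spa & Dressup Games
    ("Success", "Success"),  # GoodReader - PDF Reader, Annotator and File Manager
]

correct = sum(1 for predicted, actual in results if predicted == actual)
accuracy = correct / len(results)
actual_successes = sum(1 for _, actual in results if actual == "Success")

print(f"{correct} of {len(results)} correct ({accuracy:.0%})")
print(f"{actual_successes} of {len(results)} apps were actually successful")
```

The second count is worth keeping in mind when reading the accuracy: 8 of the 10 sampled apps were successful, so always guessing "Success" would have scored 80% on this small sample.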

### Exploring correlations between features

In [None]:
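# Restrict to numeric columns; text columns such as track_name cannot be correlated
correlations = app_metrics.select_dtypes(include='number').corr()
correlations
sns.heatmap(correlations)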