# Google Play Store Apps

<img src= "https://i.ibb.co/LNmnk58/1-h-Dq-C5-H28uy-S3-Ic8m-Bm-Sul-Q.png">

### Natural language processing (NLP)

Natural language processing is a branch of computer science concerned with analyzing text and speech. This notebook uses it to compare the user reviews of Google Play Store apps and recommend apps whose reviews are similar.

## Downloading the Dataset

In [1]:
import opendatasets as od
dataset_url = 'https://www.kaggle.com/lava18/google-play-store-apps'
od.download(dataset_url)

Please provide your Kaggle credentials to download this dataset. Learn more: http://bit.ly/kaggle-creds
Your Kaggle username: passamon
Your Kaggle Key: ········
Downloading google-play-store-apps.zip to .\google-play-store-apps


100%|██████████| 1.94M/1.94M [00:00<00:00, 5.41MB/s]


In [258]:
data_dir = './google-play-store-apps'

In [259]:
import os
os.listdir(data_dir)

['googleplaystore.csv', 'googleplaystore_user_reviews.csv', 'license.txt']

## Data Preparation and Cleaning

In [260]:
import pandas as pd
import numpy as np

In [261]:
# Build the path from data_dir instead of repeating the folder name.
googleplaystore_user_reviews = pd.read_csv(os.path.join(data_dir, 'googleplaystore_user_reviews.csv'))
googleplaystore_user_reviews

| | App | Translated_Review | Sentiment | Sentiment_Polarity | Sentiment_Subjectivity |
|---|---|---|---|---|---|
| 0 | 10 Best Foods for You | I like eat delicious food. That's I'm cooking ... | Positive | 1.00 | 0.533333 |
| 1 | 10 Best Foods for You | This help eating healthy exercise regular basis | Positive | 0.25 | 0.288462 |
| 2 | 10 Best Foods for You | | | | |
| 3 | 10 Best Foods for You | Works great especially going grocery store | Positive | 0.40 | 0.875000 |
| 4 | 10 Best Foods for You | Best idea us | Positive | 1.00 | 0.300000 |
| ... | ... | ... | ... | ... | ... |
| 64290 | Houzz Interior Design Ideas | | | | |
| 64291 | Houzz Interior Design Ideas | | | | |
| 64292 | Houzz Interior Design Ideas | | | | |
| 64293 | Houzz Interior Design Ideas | | | | |


In [262]:
apps = googleplaystore_user_reviews

 - Clean the data.
 - Replace null values with an empty string so that the review text can be concatenated.
 - Merge all reviews of the same app into a single row.

In [276]:
# Replace missing values with empty strings so reviews can be concatenated.
apps['Translated_Review'] = apps['Translated_Review'].fillna('')
apps['Sentiment'] = apps['Sentiment'].fillna('')
apps['Sentiment_Polarity'] = apps['Sentiment_Polarity'].fillna('')
apps['Sentiment_Subjectivity'] = apps['Sentiment_Subjectivity'].fillna('')

# Merge consecutive rows of the same app: append each review to the first row
# of that app and mark the later rows for removal. This relies on the rows
# being grouped by app and on the default 0..n-1 index.
last_word = ''
initial_word_index = 0
index = 0
remove_index_list = []

for word in apps['App']:
    if word == last_word:
        # .at avoids chained assignment; reviews are joined without a separator.
        apps.at[initial_word_index, 'Translated_Review'] += apps.at[index, 'Translated_Review']
        remove_index_list.append(index)
    else:
        initial_word_index = index

    last_word = word
    index = index + 1

apps = apps.drop(remove_index_list)
apps

| | App | Translated_Review | Sentiment | Sentiment_Polarity | Sentiment_Subjectivity | text |
|---|---|---|---|---|---|---|
| 0 | 10 Best Foods for You | I like eat delicious food. That's I'm cooking ... | Positive | 1 | 0.533333 | I like eat delicious food. That's I'm cooking ... |
| 200 | 104 找工作 - 找工作 找打工 找兼職 履歷健檢 履歷診療室 | GreatniceAlmost mobile phoneVery effective, ef... | Positive | 0.8 | 0.75 | GreatniceAlmost mobile phoneVery effective, ef... |
| 240 | 11st | Horrible ID verificationEasy even basic Korean... | Negative | -1 | 1 | Horrible ID verificationEasy even basic Korean... |
| 280 | 1800 Contacts - Lens Store | Great hassle free way order contacts. Got call... | Positive | 0.6 | 0.775 | Great hassle free way order contacts. Got call... |
| 360 | 1LINE – One Line with One Touch | gets 1* there's ad every single level restart,... | Negative | -0.157143 | 0.704762 | gets 1* there's ad every single level restart,... |
| ... | ... | ... | ... | ... | ... | ... |
| 64076 | Hotspot Shield Free VPN Proxy & Wi-Fi Security | 7 days free trial asking credit card. Stupid!!... | Negative | -0.3 | 0.9 | 7 days free trial asking credit card. Stupid!!... |
| 64116 | Hotstar | runningBestYou great collection shows movies. ... | Neutral | 0 | 0 | runningBestYou great collection shows movies. ... |
| 64156 | Hotwire Hotel & Car Rental App | The worthless. It allow see information would ... | | | | The worthless. It allow see information would ... |
| 64196 | Housing-Real Estate & Property | Incorrect listings. The agents show property d... | Negative | -0.025 | 0.125 | Incorrect listings. The agents show property d... |
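
The row-by-row merge in the cell above can also be written with `groupby`. The following is a minimal sketch on a toy frame, not the notebook's own method: the `reviews` frame and its contents are made up for illustration, and a space is assumed as the separator so that the last word of one review does not fuse with the first word of the next.

```python
import pandas as pd

# Hypothetical toy data standing in for googleplaystore_user_reviews.csv.
reviews = pd.DataFrame({
    "App": ["A", "A", "B", "B", "B"],
    "Translated_Review": ["Great", None, "Bad ads", "Slow", None],
})

# Treat missing reviews as empty strings, then join the non-empty reviews of each app.
merged = (
    reviews.assign(Translated_Review=reviews["Translated_Review"].fillna(""))
    .groupby("App", sort=False)["Translated_Review"]
    .agg(lambda texts: " ".join(t for t in texts if t))
    .reset_index()
)
print(merged)
```

Unlike the loop, this version does not depend on rows of the same app being adjacent or on the index running from 0 to n-1.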


In [264]:
apps["text"] = apps["Translated_Review"]

## Train Model

<img src="https://i.ibb.co/KXY6D4k/NLP.jpg">

In [265]:
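from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import linear_kernel

# Build TF-IDF vectors over unigrams and bigrams of each app's combined reviews.
# min_df=1 keeps every term; recent scikit-learn versions reject an integer min_df of 0.
tf = TfidfVectorizer(analyzer='word', ngram_range=(1, 2), min_df=1, stop_words='english')
tfidf_matrix = tf.fit_transform(apps["text"].values.astype('U'))

# TfidfVectorizer L2-normalizes each row by default, so the plain dot product
# computed by linear_kernel equals the cosine similarity.
cosine_sim = linear_kernel(tfidf_matrix, tfidf_matrix)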
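
To see why `linear_kernel` can stand in for cosine similarity here, the sketch below compares the two on a small corpus. The `docs` list is a made-up example, not data from the notebook; the check relies on `TfidfVectorizer` using its default `norm="l2"`.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity, linear_kernel

# Hypothetical toy corpus, one "document" per app.
docs = [
    "cheap hotel booking deals",
    "book hotel rooms and flights",
    "puzzle game with many levels",
]

vectorizer = TfidfVectorizer(analyzer="word", ngram_range=(1, 2), stop_words="english")
matrix = vectorizer.fit_transform(docs)

# Rows are L2-normalized by default, so the dot product is already the cosine.
dot = linear_kernel(matrix, matrix)
cos = cosine_similarity(matrix, matrix)
same = np.allclose(dot, cos)
print(same)
```

The two hotel documents share the term "hotel" and therefore score above zero against each other, while neither shares a term with the puzzle document.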

In [266]:
titles = apps["App"]
indices = pd.Series(apps.index, index=apps["App"])

In [267]:
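def get_app_review(app_name):
    # Return the row position (not the index label) of the app in `apps`
    # as a one-element list, or an empty list if the app is not present.
    index = 0

    for word in apps['App']:
        if word == app_name:
            return [index]

        index = index + 1

    return []


def get_recommendations(app_name):
    positions = get_app_review(app_name)
    sim_scores = []
    for idx in positions:
        sim_scores = sim_scores + list(enumerate(cosine_sim[idx]))
    # Sort all apps by similarity to the query app, most similar first.
    sim_scores = sorted(sim_scores, key=lambda x: x[1], reverse=True)
    app_indices = [i[0] for i in sim_scores]
    # The top hit is expected to be the query app itself, so skip it.
    return titles.iloc[app_indices][len(positions):]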


In [279]:
get_recommendations("Expedia Hotels, Flights & Car Rental Travel Deals").head(10)

63996    Hotels.com: Book Hotel Rooms & Find Vacation D...
63776     HotelTonight: Book amazing deals at great hotels
16767                             Booking.com Travel Deals
63956                        Hotels Combined - Cheap deals
25258          Cheap hotel deals and discounts — Hotellook
64156                       Hotwire Hotel & Car Rental App
4752                           Agoda – Hotel Booking Deals
54182     Goibibo - Flight Hotel Bus Car IRCTC Booking App
46878                   Flight & Hotel Booking App - ixigo
20567                                            CWT To Go
Name: App, dtype: object
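
The ranking step can also be done with `numpy.argsort` instead of sorting a list of tuples. This is only a sketch under stated assumptions: `top_k_similar`, `names`, and the `sim` matrix are hypothetical and not part of the notebook. It excludes the query app by position rather than assuming it ranks first, which matters when an app has no review text and therefore a self-similarity of zero.

```python
import numpy as np

def top_k_similar(sim_matrix, names, query, k=3):
    """Return the k names most similar to `query`, excluding the query itself."""
    pos = names.index(query)  # raises ValueError if the query is unknown
    order = np.argsort(-sim_matrix[pos], kind="stable")  # descending by score
    return [names[i] for i in order if i != pos][:k]

# Hypothetical similarity scores for four apps.
names = ["Hotels", "Flights", "Puzzle", "Chess"]
sim = np.array([
    [1.0, 0.6, 0.0, 0.1],
    [0.6, 1.0, 0.1, 0.0],
    [0.0, 0.1, 1.0, 0.7],
    [0.1, 0.0, 0.7, 1.0],
])
result = top_k_similar(sim, names, "Hotels", k=2)
print(result)
```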