<a href="https://colab.research.google.com/github/kalpanibhagya/EmergencyAppsReviews/blob/datafileseparation/DataExtractionAndTranslation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Project 17: Analysis of Online Reviews of Emergency apps using NLP and Technology Acceptance Models (TAM).**

In this project, we collect online reviews of two emergency apps (Emergency Plus and First Aid: American Red Cross) from the Apple App Store and Google Play to understand how users react to them.

**Task 1 and Task 2: Preprocessing**

**Important note:**

**Run this file in Google Colab.**

Issues may arise when running it in a local Jupyter notebook; the code works as expected in Colab.

The resulting CSV files appear in the Files folder in the left sidebar.

Previously extracted CSV files for each app are available in the .data/Task 1/Result folder of the project's GitHub repository.
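
For a quick sanity check of one of the extracted files, a minimal sketch along the following lines may help. It assumes the CSV has the columns this notebook writes (`Review Id`, `User Name`, `Review`, `Rating`, `Date of Review`, `reviewCreatedVersion`). The helper name `rating_counts` is hypothetical, and the sketch uses only the standard library so that it can also run outside Colab.

```python
import csv
from collections import Counter

# Column names written by this notebook's to_csv call
EXPECTED_COLUMNS = ["Review Id", "User Name", "Review", "Rating",
                    "Date of Review", "reviewCreatedVersion"]


def rating_counts(csv_path):
    """Count how many reviews in an extracted CSV carry each rating (hypothetical helper)."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        # Fail early if the file does not look like one of the extracted CSVs
        missing = [c for c in EXPECTED_COLUMNS if c not in (reader.fieldnames or [])]
        if missing:
            raise ValueError(f"Unexpected file layout, missing columns: {missing}")
        # Ratings are read back as strings, e.g. "5"
        return Counter(row["Rating"] for row in reader)
```

Calling `rating_counts("Emergency Plus.csv")` on a file in the working directory would return a `Counter` keyed by the rating strings; adjust the path to wherever the file actually lives.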

In [1]:
pip install google_play_scraper

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting google_play_scraper
  Downloading google_play_scraper-1.2.2-py3-none-any.whl (28 kB)
Installing collected packages: google-play-scraper
Successfully installed google-play-scraper-1.2.2


In [2]:
pip install app_store_scraper

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting app_store_scraper
  Downloading app_store_scraper-0.3.5-py3-none-any.whl (8.3 kB)
Installing collected packages: app-store-scraper
Successfully installed app-store-scraper-0.3.5


In [3]:
pip install -U deep-translator

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting deep-translator
  Downloading deep_translator-1.9.1-py3-none-any.whl (30 kB)
Collecting beautifulsoup4<5.0.0,>=4.9.1
  Downloading beautifulsoup4-4.11.1-py3-none-any.whl (128 kB)
Collecting soupsieve>1.2
  Downloading soupsieve-2.3.2.post1-py3-none-any.whl (37 kB)
Installing collected packages: soupsieve, beautifulsoup4, deep-translator
  Attempting uninstall: beautifulsoup4
    Found existing installation: beautifulsoup4 4.6.3
    Uninstalling beautifulsoup4-4.6.3:
      Successfully uninstalled beautifulsoup4-4.6.3
Successfully installed beautifulsoup4-4.11.1 deep-translator-1.9.1 soupsieve-2.3.2.post1


In [5]:
from google_play_scraper import reviews_all
import pandas as pd
from deep_translator import GoogleTranslator
from app_store_scraper import AppStore
import numpy as np

# Placeholder substituted for missing reviews so every row survives the join/split round trip
valueToFillForEmptyReviews = "No review Comment"

def PrepareDataForTranslation(reviews_dataframe, maxChunkLength=3000):
  """Split the 'Review' column into lists of reviews whose joined length stays within maxChunkLength."""
  reviews_dataframe['Review'] = reviews_dataframe['Review'].fillna(valueToFillForEmptyReviews)
  # '|' is used as the separator during translation, so remove it from the review text
  reviews_dataframe['Review'] = reviews_dataframe['Review'].str.replace('|', '', regex=False)

  splittedArray = []
  arrayChunk = []
  characterLength = 0
  for review in reviews_dataframe['Review']:
    # Start a new chunk when adding this review would exceed the limit
    if arrayChunk and characterLength + len(review) > maxChunkLength:
      splittedArray.append(arrayChunk)
      arrayChunk = []
      characterLength = 0
    arrayChunk.append(review)
    characterLength += len(review) + 1  # +1 accounts for the '|' separator

  # Keep the last, partially filled chunk
  if arrayChunk:
    splittedArray.append(arrayChunk)
  return splittedArray


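def TranslateReviewContent(splittedReviewList):
  """Translate each chunk in one request and return one translated string per review."""
  translatedReviews = []
  for reviewChunk in splittedReviewList:
    # Join the chunk with '|' so it is translated in a single call, then split it back
    concatenatedReviewString = "|".join(reviewChunk)
    translatedValue = translator.translate(concatenatedReviewString)
    splittedTranslatedContent = translatedValue.split("|")
    if len(splittedTranslatedContent) != len(reviewChunk):
      # The separator count changed during translation; translate one by one to keep rows aligned
      splittedTranslatedContent = [translator.translate(review) for review in reviewChunk]
    translatedReviews.extend(text.strip() for text in splittedTranslatedContent)
  return translatedReviews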


def ExtractAndTranslateGoogleReviews(app):
  # Download every available Google Play review for the app
  googleReviewResult = reviews_all(app['googleId'],
                                   lang=app['googleLanguage'],
                                   country=app['googleCountry'])

  googleAppReviewsdf = pd.DataFrame(googleReviewResult)
  googleAppReviewsdf = googleAppReviewsdf.rename(columns={'reviewId': 'Review Id', 'userName': 'User Name', 'content': 'Review', 'score': 'Rating', 'at': 'Date of Review'})
  splittedReviewList = PrepareDataForTranslation(googleAppReviewsdf)
  googleAppReviewsdf['Review'] = TranslateReviewContent(splittedReviewList)
  # Turn the placeholder back into an empty review (the result must be assigned to take effect)
  googleAppReviewsdf['Review'] = googleAppReviewsdf['Review'].replace(valueToFillForEmptyReviews, '')
  return googleAppReviewsdf

def ExtractAndTranslateAppleReviews(app):
  # Fetch up to 2000 App Store reviews for the app
  appleApp = AppStore(app_name=app['appStoreName'], app_id=app['appleAppId'], country=app['appleCountry'])
  appleApp.review(how_many=2000)
  appleAppReviewsdf = pd.DataFrame(appleApp.reviews)
  # Drop fully empty rows and renumber so the translated list lines up with the rows
  appleAppReviewsdf = appleAppReviewsdf.dropna(how='all').reset_index(drop=True)
  appleAppReviewsdf = appleAppReviewsdf.rename(columns={'userName': 'User Name', 'review': 'Review', 'rating': 'Rating', 'date': 'Date of Review'})
  splittedAppleReviewList = PrepareDataForTranslation(appleAppReviewsdf)
  appleAppReviewsdf['Review'] = TranslateReviewContent(splittedAppleReviewList)
  # Turn the placeholder back into an empty review (the result must be assigned to take effect)
  appleAppReviewsdf['Review'] = appleAppReviewsdf['Review'].replace(valueToFillForEmptyReviews, '')
  return appleAppReviewsdf





emergencyApps = [{'appName': 'Emergency Plus','googleId': 'com.threesixtyentertainment.nesn', 'googleLanguage': 'en', 'googleCountry': 'us', 
                                    'appStoreName':'emergency-plus','appleAppId' : '691814685', 'appleCountry': 'au'},
                  {'appName': 'Red Cross First Aid','googleId': 'com.cube.arc.fa', 'googleLanguage': 'en', 'googleCountry': 'us', 
                                    'appStoreName':'first-aid-american-red-cross','appleAppId' : '529160691', 'appleCountry': 'us'}
]

headers = ["Review Id", "User Name" ,"Review","Rating", "Date of Review", "reviewCreatedVersion"] 
translator = GoogleTranslator(source='auto', target='en')

for app in emergencyApps: 
  
  googleAppReviewsdf = ExtractAndTranslateGoogleReviews(app)
  appleAppReviewsdf = ExtractAndTranslateAppleReviews(app)

  combinedReviewsdf = pd.concat([googleAppReviewsdf, appleAppReviewsdf])
  combinedReviewsdf['Date of Review'] = combinedReviewsdf['Date of Review'].dt.date
  combinedReviewsdf.to_csv(app['appName']+'.csv', index=False, columns = headers, header=True)

  print(app['appName']+".csv file was created successfully. Check the Files folder in the left sidebar. If it is not visible yet, refresh the folder.")


Emergency Plus.csv file was created successfully. Check the Files folder in the left sidebar. If it is not visible yet, refresh the folder.
Red Cross First Aid.csv file was created successfully. Check the Files folder in the left sidebar. If it is not visible yet, refresh the folder.
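
The batching approach used in the cell above (join several reviews with `|`, translate them in one request, then split the result on `|`) can be tried without network access. The sketch below is not the notebook's code: `chunk_reviews`, `translate_in_chunks`, and the `fake_translate` stand-in are hypothetical names, and the 3000-character default is taken from the cell above. The fallback to per-review translation when the separator count changes is an assumption about how a real translator might behave.

```python
def chunk_reviews(reviews, max_chars=3000, sep="|"):
    """Group reviews so that each joined chunk stays within max_chars."""
    chunks, current, length = [], [], 0
    for review in reviews:
        review = review.replace(sep, "")  # the separator must not occur in the text
        added = len(review) + (len(sep) if current else 0)
        if current and length + added > max_chars:
            # Adding this review would exceed the limit: close the chunk
            chunks.append(current)
            current, length, added = [], 0, len(review)
        current.append(review)
        length += added
    if current:
        chunks.append(current)
    return chunks


def translate_in_chunks(reviews, translate, max_chars=3000, sep="|"):
    """Translate reviews chunk by chunk, returning one output per input."""
    translated = []
    for chunk in chunk_reviews(reviews, max_chars, sep):
        parts = translate(sep.join(chunk)).split(sep)
        if len(parts) != len(chunk):
            # Separator was lost or duplicated: translate one review at a time
            parts = [translate(review) for review in chunk]
        translated.extend(part.strip() for part in parts)
    return translated


def fake_translate(text):
    """Stand-in for a real translator (assumption): just upper-cases the text."""
    return text.upper()


reviews = ["bonne application", "muy bueno", "works | well"]
print(translate_in_chunks(reviews, fake_translate, max_chars=25))
```

Passing the translator in as a function keeps the chunking logic testable on its own; in the notebook, `translator.translate` would take the place of `fake_translate`.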
