# Project FOCS - _Google Play Store Dataset_

Import the libraries used in the project

In [1]:
import pandas as pd
import numpy as np
import re

Load the datasets the tasks will run on and take a quick look at them

In [2]:
data_ps = pd.read_csv('files/googleplaystore.csv')
data_ps.head(3)

Unnamed: 0,App,Category,Rating,Reviews,Size,Installs,Type,Price,Content Rating,Genres,Last Updated,Current Ver,Android Ver
0,Photo Editor & Candy Camera & Grid & ScrapBook,ART_AND_DESIGN,4.1,159,19M,"10,000+",Free,0,Everyone,Art & Design,"January 7, 2018",1.0.0,4.0.3 and up
1,Coloring book moana,ART_AND_DESIGN,3.9,967,14M,"500,000+",Free,0,Everyone,Art & Design;Pretend Play,"January 15, 2018",2.0.0,4.0.3 and up
2,"U Launcher Lite – FREE Live Cool Themes, Hide ...",ART_AND_DESIGN,4.7,87510,8.7M,"5,000,000+",Free,0,Everyone,Art & Design,"August 1, 2018",1.2.4,4.0.3 and up


In [3]:
data_ur = pd.read_csv('files/googleplaystore_user_reviews.csv')
data_ur.head(3)

Unnamed: 0,App,Translated_Review,Sentiment,Sentiment_Polarity,Sentiment_Subjectivity
0,10 Best Foods for You,I like eat delicious food. That's I'm cooking ...,Positive,1.0,0.533333
1,10 Best Foods for You,This help eating healthy exercise regular basis,Positive,0.25,0.288462
2,10 Best Foods for You,,,,


In [4]:
# check the dtype of each attribute
data_ps.dtypes

App                object
Category           object
Rating            float64
Reviews            object
Size               object
Installs           object
Type               object
Price              object
Content Rating     object
Genres             object
Last Updated       object
Current Ver        object
Android Ver        object
dtype: object

In [5]:
data_ur.dtypes

App                        object
Translated_Review          object
Sentiment                  object
Sentiment_Polarity        float64
Sentiment_Subjectivity    float64
dtype: object

## _1. Convert the app sizes to a number_

To avoid problems with the rows whose Size is 'Varies with device', I temporarily replace that value with a sentinel number (999999999). It has no unit suffix, so the conversion function below leaves it unchanged.

In [6]:
data_ps['Size'] = [re.sub('Varies with device', '999999999', size) for size in data_ps['Size']] # temporary sentinel
data_ps['Size'] = [re.sub(',', '.', size) for size in data_ps['Size']] # replace the comma with a dot


In [7]:
pattern = re.compile(r'(?P<number>\d*\.*\d*)(?P<unit>\w*\+*)')

def convert(unit):
    if unit == 'G':
        return 1000000000
    if unit == 'M':
        return 1000000
    if unit == 'k':
        return 1000
    return 1

def to_numeric(elem):
    m = pattern.search(elem)
    unit = m.group('unit')
    mult = convert(unit)
    num = float(m.group('number'))
    return int(num * mult)


In [8]:
data_ps['ValueSize'] = data_ps['Size'].apply(to_numeric)
# list comprehension to move the target attributes to the end of the dataframe
data_ps = data_ps[[c for c in data_ps if c not in ['Size','ValueSize']] + ['Size','ValueSize']]


Restore 'Varies with device' to its original form

In [9]:
data_ps['Size'] = [re.sub('999999999', 'Varies with device', size) for size in data_ps['Size']]
data_ps['ValueSize'] = data_ps['ValueSize'].apply(str)
data_ps['ValueSize'] = [re.sub('999999999', 'Varies with device', size) for size in data_ps['ValueSize']]
data_ps.head(3)

Unnamed: 0,App,Category,Rating,Reviews,Installs,Type,Price,Content Rating,Genres,Last Updated,Current Ver,Android Ver,Size,ValueSize
0,Photo Editor & Candy Camera & Grid & ScrapBook,ART_AND_DESIGN,4.1,159,"10,000+",Free,0,Everyone,Art & Design,"January 7, 2018",1.0.0,4.0.3 and up,19M,19000000
1,Coloring book moana,ART_AND_DESIGN,3.9,967,"500,000+",Free,0,Everyone,Art & Design;Pretend Play,"January 15, 2018",2.0.0,4.0.3 and up,14M,14000000
2,"U Launcher Lite – FREE Live Cool Themes, Hide ...",ART_AND_DESIGN,4.7,87510,"5,000,000+",Free,0,Everyone,Art & Design,"August 1, 2018",1.2.4,4.0.3 and up,8.7M,8700000


Until I decide how to handle the records that contain 'Varies with device', I cannot convert the ValueSize attribute from object to int/float (this is done later, in task 3)
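As a possible alternative to the sentinel round trip, here is a minimal sketch of a vectorized conversion. The helper name `size_to_bytes` and the sample values are hypothetical; the decimal multipliers mirror `convert` above. Values that do not match the pattern, such as 'Varies with device', simply come out as NaN.

```python
import numpy as np
import pandas as pd

def size_to_bytes(sizes: pd.Series) -> pd.Series:
    """Hypothetical helper: turn '19M' / '201k' into a float number of bytes."""
    multipliers = {'k': 1e3, 'M': 1e6, 'G': 1e9}  # same decimal factors as convert()
    # Non-matching strings (e.g. 'Varies with device') yield NaN in both columns.
    parts = sizes.str.extract(r'^(?P<number>\d+(?:\.\d+)?)(?P<unit>[kMG])$')
    number = pd.to_numeric(parts['number'])
    return number * parts['unit'].map(multipliers)

demo = pd.Series(['19M', '8.7M', '201k', 'Varies with device'])
demo_sizes = size_to_bytes(demo)
print(demo_sizes)
```

The result is a float column from the start, so no later conversion from object is needed under these assumptions.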

## _2. Convert the number of installs to a number_

In [10]:
data_ps.dtypes

App                object
Category           object
Rating            float64
Reviews            object
Installs           object
Type               object
Price              object
Content Rating     object
Genres             object
Last Updated       object
Current Ver        object
Android Ver        object
Size               object
ValueSize          object
dtype: object

In [11]:
# check which values I will have to handle
data_ps.groupby('Installs')['Installs'].size()

Installs
0                    1
0+                  14
1+                  67
1,000+             907
1,000,000+        1579
1,000,000,000+      58
10+                386
10,000+           1054
10,000,000+       1252
100+               719
100,000+          1169
100,000,000+       409
5+                  82
5,000+             477
5,000,000+         752
50+                205
50,000+            479
50,000,000+        289
500+               330
500,000+           539
500,000,000+        72
Free                 1
Name: Installs, dtype: int64

In [12]:
# inspect the anomalous value 'Free'
data_ps[data_ps['Installs'].str.contains('Free')]

Unnamed: 0,App,Category,Rating,Reviews,Installs,Type,Price,Content Rating,Genres,Last Updated,Current Ver,Android Ver,Size,ValueSize
10472,Life Made WI-Fi Touchscreen Photo Frame,1.9,19.0,3.0M,Free,0,Everyone,,"February 11, 2018",1.0.19,4.0 and up,,1.000+,1


Since the whole row is wrong, with misaligned values (probably an error made while the data was collected), I drop the record

In [13]:
data_ps = data_ps.drop(10472)
#re-check
data_ps.groupby('Installs')['Installs'].size()

Installs
0                    1
0+                  14
1+                  67
1,000+             907
1,000,000+        1579
1,000,000,000+      58
10+                386
10,000+           1054
10,000,000+       1252
100+               719
100,000+          1169
100,000,000+       409
5+                  82
5,000+             477
5,000,000+         752
50+                205
50,000+            479
50,000,000+        289
500+               330
500,000+           539
500,000,000+        72
Name: Installs, dtype: int64

In [14]:
data_ps['N_Installs'] = [re.sub('\\D', '', number) for number in data_ps['Installs']] # strip every non-digit character
data_ps['N_Installs'] = data_ps['N_Installs'].apply(int) # convert to integers
data_ps['N_Installs'].dtypes # check

dtype('int64')
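The same cleaning can also be written with vectorized string methods. The sketch below uses made-up sample values; the validity pattern `[\d,]+\+?` is my assumption about what a well-formed Installs value looks like, and it flags a stray entry such as 'Free' before conversion.

```python
import pandas as pd

installs = pd.Series(['10,000+', '500,000+', '0', 'Free'])

# True only for strings made of digits and commas with an optional trailing '+'.
is_valid = installs.str.fullmatch(r'[\d,]+\+?')

# Strip every non-digit character from the valid rows and parse them as integers.
n_installs = pd.to_numeric(installs[is_valid].str.replace(r'\D', '', regex=True))
print(n_installs)
```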

In [15]:
# list comprehension to move the target attributes to the end of the dataframe
data_ps = data_ps[[c for c in data_ps if c not in ['Installs','N_Installs']] + ['Installs','N_Installs']]
data_ps.head(3)

Unnamed: 0,App,Category,Rating,Reviews,Type,Price,Content Rating,Genres,Last Updated,Current Ver,Android Ver,Size,ValueSize,Installs,N_Installs
0,Photo Editor & Candy Camera & Grid & ScrapBook,ART_AND_DESIGN,4.1,159,Free,0,Everyone,Art & Design,"January 7, 2018",1.0.0,4.0.3 and up,19M,19000000,"10,000+",10000
1,Coloring book moana,ART_AND_DESIGN,3.9,967,Free,0,Everyone,Art & Design;Pretend Play,"January 15, 2018",2.0.0,4.0.3 and up,14M,14000000,"500,000+",500000
2,"U Launcher Lite – FREE Live Cool Themes, Hide ...",ART_AND_DESIGN,4.7,87510,Free,0,Everyone,Art & Design,"August 1, 2018",1.2.4,4.0.3 and up,8.7M,8700000,"5,000,000+",5000000


## _3. Transform “Varies with device” into a missing value_

In [16]:
data_ps = data_ps.replace('Varies with device', np.nan)

Now that 'Varies with device' is represented as NaN, I can convert the ValueSize attribute to float

In [17]:
data_ps['ValueSize'] = data_ps['ValueSize'].apply(float) # convert to float (NaN cannot be stored in a plain int column)
data_ps['ValueSize'].dtypes #Check

dtype('float64')
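If whole-number sizes are preferred, pandas also offers the nullable `Int64` dtype, which can hold missing values. This is only a sketch on made-up values, not something the notebook relies on.

```python
import numpy as np
import pandas as pd

sizes = pd.Series([19000000.0, np.nan, 8700000.0])

# 'Int64' (capital I) is the nullable integer dtype: NaN becomes pd.NA.
sizes_int = sizes.astype('Int64')
print(sizes_int)
```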

## _4. Convert Current Ver and Android Ver into a dotted number (e.g. 4.0.3 or 4.2)_

In [18]:
data_ps.head(3)

Unnamed: 0,App,Category,Rating,Reviews,Type,Price,Content Rating,Genres,Last Updated,Current Ver,Android Ver,Size,ValueSize,Installs,N_Installs
0,Photo Editor & Candy Camera & Grid & ScrapBook,ART_AND_DESIGN,4.1,159,Free,0,Everyone,Art & Design,"January 7, 2018",1.0.0,4.0.3 and up,19M,19000000.0,"10,000+",10000
1,Coloring book moana,ART_AND_DESIGN,3.9,967,Free,0,Everyone,Art & Design;Pretend Play,"January 15, 2018",2.0.0,4.0.3 and up,14M,14000000.0,"500,000+",500000
2,"U Launcher Lite – FREE Live Cool Themes, Hide ...",ART_AND_DESIGN,4.7,87510,Free,0,Everyone,Art & Design,"August 1, 2018",1.2.4,4.0.3 and up,8.7M,8700000.0,"5,000,000+",5000000


For both 'Current Ver' and 'Android Ver' I remove the alphabetic characters, so that only the dotted number is left

In [19]:
# remove ASCII letters ([A-Za-z] matches letters only), then trim the leftover spaces
data_ps['Current Ver_fix'] = data_ps['Current Ver'].replace(r'[A-Za-z]+', '', regex=True).str.strip()
data_ps['Android Ver_fix'] = data_ps['Android Ver'].replace(r'[A-Za-z]+', '', regex=True).str.strip()

In [20]:
# list comprehension to move the target attributes to the end of the dataframe
data_ps = data_ps[[c for c in data_ps if c not in ['Current Ver','Current Ver_fix','Android Ver','Android Ver_fix']] 
                                                      + ['Current Ver','Current Ver_fix','Android Ver','Android Ver_fix']]
data_ps.head(3)

Unnamed: 0,App,Category,Rating,Reviews,Type,Price,Content Rating,Genres,Last Updated,Size,ValueSize,Installs,N_Installs,Current Ver,Current Ver_fix,Android Ver,Android Ver_fix
0,Photo Editor & Candy Camera & Grid & ScrapBook,ART_AND_DESIGN,4.1,159,Free,0,Everyone,Art & Design,"January 7, 2018",19M,19000000.0,"10,000+",10000,1.0.0,1.0.0,4.0.3 and up,4.0.3
1,Coloring book moana,ART_AND_DESIGN,3.9,967,Free,0,Everyone,Art & Design;Pretend Play,"January 15, 2018",14M,14000000.0,"500,000+",500000,2.0.0,2.0.0,4.0.3 and up,4.0.3
2,"U Launcher Lite – FREE Live Cool Themes, Hide ...",ART_AND_DESIGN,4.7,87510,Free,0,Everyone,Art & Design,"August 1, 2018",8.7M,8700000.0,"5,000,000+",5000000,1.2.4,1.2.4,4.0.3 and up,4.0.3
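Removing letters leaves values such as '4.1 - 7.1.1' untouched. A stricter option, sketched here on made-up values, is to extract the first dotted number directly. Keeping only the lower bound of a range is my assumption, not a requirement stated in the task.

```python
import pandas as pd

versions = pd.Series(['4.0.3 and up', '4.1 - 7.1.1', '1.2.4', None])

# Capture the first run of digits separated by dots; missing values stay missing.
dotted = versions.str.extract(r'(\d+(?:\.\d+)*)', expand=False)
print(dotted)
```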


## _5. Remove the duplicates_

Here I need a criterion for choosing which duplicate is the most meaningful one to keep. I use an app's number of reviews: the duplicate with the most reviews is taken to be the most recent one.

First I check whether there are fully identical records

In [21]:
len(data_ps)

10840

In [22]:
data_ps.drop_duplicates(inplace=True)
len(data_ps)

10357

In [23]:
data_ps['Reviews'].dtypes #Check 

dtype('O')

In [24]:
data_ps['Reviews'] = pd.to_numeric(data_ps['Reviews']) # convert to numeric
data_ps['Reviews'].dtypes #Check

dtype('int64')

I pick a target app (Telegram) to check whether the highest-reviews criterion can work

In [25]:
data_ps[data_ps['App'] == 'Telegram'][['App','Reviews']] 

Unnamed: 0,App,Reviews
370,Telegram,3128250
392,Telegram,3128509
4592,Telegram,3128611


I now sort the dataframe by Reviews in ascending order and then keep only the last of each group of duplicates, i.e. the one with the highest number of reviews

In [26]:
data_ps = data_ps.sort_values('Reviews')
data_ps[data_ps['App'] == 'Telegram'][['App','Reviews']] #check

Unnamed: 0,App,Reviews
370,Telegram,3128250
392,Telegram,3128509
4592,Telegram,3128611


In [27]:
data_ps.drop_duplicates(subset = 'App', 
                        keep = 'last', inplace = True) # drop duplicates by app name, keeping the last one
data_ps[data_ps['App'] == 'Telegram'][['App','Reviews']] # check

Unnamed: 0,App,Reviews
4592,Telegram,3128611


In [28]:
len(data_ps)

9659
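The sort-then-keep-last idea can be checked on a tiny made-up dataframe. The review counts below are the Telegram values shown above, deliberately shuffled, plus one unrelated app.

```python
import pandas as pd

apps = pd.DataFrame({
    'App': ['Telegram', 'Telegram', 'Telegram', 'Sketch'],
    'Reviews': [3128250, 3128611, 3128509, 215644],
})

# After an ascending sort, the last row of each app is the one with most reviews.
deduped = (apps.sort_values('Reviews')
               .drop_duplicates(subset='App', keep='last')
               .sort_index())
print(deduped)
```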

## _6. For each category, compute the number of apps_

Before continuing, I create a copy of the dataframe that keeps only the attributes fixed in the previous tasks

In [29]:
data_ps_fix = data_ps.drop(['Size', 'Installs', 'Current Ver', 'Android Ver'], axis = 1)
data_ps_fix = data_ps_fix.sort_index()
data_ps_fix.head(3)

Unnamed: 0,App,Category,Rating,Reviews,Type,Price,Content Rating,Genres,Last Updated,ValueSize,N_Installs,Current Ver_fix,Android Ver_fix
0,Photo Editor & Candy Camera & Grid & ScrapBook,ART_AND_DESIGN,4.1,159,Free,0,Everyone,Art & Design,"January 7, 2018",19000000.0,10000,1.0.0,4.0.3
2,"U Launcher Lite – FREE Live Cool Themes, Hide ...",ART_AND_DESIGN,4.7,87510,Free,0,Everyone,Art & Design,"August 1, 2018",8700000.0,5000000,1.2.4,4.0.3
3,Sketch - Draw & Paint,ART_AND_DESIGN,4.5,215644,Free,0,Teen,Art & Design,"June 8, 2018",25000000.0,50000000,,4.2


In [30]:
# back to the task
data_ps_fix.groupby('Category').size().sort_values(ascending=False)

Category
FAMILY                 1875
GAME                    946
TOOLS                   829
BUSINESS                420
MEDICAL                 395
PERSONALIZATION         376
PRODUCTIVITY            374
LIFESTYLE               369
FINANCE                 345
SPORTS                  325
COMMUNICATION           315
HEALTH_AND_FITNESS      288
PHOTOGRAPHY             281
NEWS_AND_MAGAZINES      254
SOCIAL                  239
BOOKS_AND_REFERENCE     222
TRAVEL_AND_LOCAL        219
SHOPPING                202
DATING                  170
VIDEO_PLAYERS           164
MAPS_AND_NAVIGATION     131
FOOD_AND_DRINK          112
EDUCATION               106
ENTERTAINMENT            87
AUTO_AND_VEHICLES        85
LIBRARIES_AND_DEMO       84
WEATHER                  79
HOUSE_AND_HOME           73
EVENTS                   64
ART_AND_DESIGN           61
PARENTING                60
COMICS                   56
BEAUTY                   53
dtype: int64
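For a plain count per category, `value_counts` gives the same numbers already sorted in descending order. A short sketch on made-up categories:

```python
import pandas as pd

categories = pd.Series(['FAMILY', 'GAME', 'FAMILY', 'TOOLS', 'FAMILY', 'GAME'],
                       name='Category')

# Counts each distinct value and sorts from most to least frequent.
counts = categories.value_counts()
print(counts)
```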

## _7. For each category, compute the average rating_

In [31]:
data_ps_fix.groupby('Category')[['Rating']].mean().sort_values(by = 'Rating', ascending = False)

Category,Rating
EVENTS,4.435556
ART_AND_DESIGN,4.359322
EDUCATION,4.351429
BOOKS_AND_REFERENCE,4.34497
PERSONALIZATION,4.332215
PARENTING,4.3
BEAUTY,4.278571
SOCIAL,4.247291
GAME,4.244605
WEATHER,4.243056
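`mean` skips missing ratings, so it can help to show how many ratings each average rests on. A minimal sketch with made-up numbers:

```python
import numpy as np
import pandas as pd

ratings = pd.DataFrame({
    'Category': ['EVENTS', 'EVENTS', 'GAME', 'GAME', 'GAME'],
    'Rating': [4.5, np.nan, 4.0, 4.4, 4.2],
})

# 'count' reports the number of non-missing ratings behind each mean.
summary = (ratings.groupby('Category')['Rating']
                  .agg(['mean', 'count'])
                  .sort_values('mean', ascending=False))
print(summary)
```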


## _8. Create two dataframes: one for the genres and one bridging apps and genres. So that, for instance, the app Pixel Draw - Number Art Coloring Book appears twice in the bridging table, once for Art & Design, once for Creativity_

In [32]:
data_ps_fix.groupby('Genres').size() # check how the genres are recorded (at most 2 per app)

Genres
Action                                   299
Action;Action & Adventure                 12
Adventure                                 73
Adventure;Action & Adventure               5
Adventure;Brain Games                      1
Adventure;Education                        1
Arcade                                   184
Arcade;Action & Adventure                 14
Arcade;Pretend Play                        1
Art & Design                              57
Art & Design;Action & Adventure            1
Art & Design;Creativity                    6
Art & Design;Pretend Play                  1
Auto & Vehicles                           85
Beauty                                    53
Board                                     42
Board;Action & Adventure                   3
Board;Brain Games                         14
Board;Pretend Play                         1
Books & Reference                        222
Books & Reference;Creativity               1
Books & Reference;Education                2
Bus

I therefore treat a record as having more than one genre whenever its Genres value contains several entries separated by ';'

In [33]:
data_ps_fix['Genres_fix'] = [g.split(';') for g in data_ps_fix['Genres']] # split the genres into a list
genres = data_ps_fix.Genres_fix.apply(pd.Series) # build a dataframe with one column per genre position
genres.head(5)

Unnamed: 0,0,1
0,Art & Design,
2,Art & Design,
3,Art & Design,
4,Art & Design,Creativity
5,Art & Design,


In [34]:
# rename the columns
genres = genres.rename(columns= {0:'1°_Genre', 1:'2°_Genre'})
genres.tail(5)

Unnamed: 0,1°_Genre,2°_Genre
10836,Education,
10837,Education,
10838,Medical,
10839,Books & Reference,
10840,Lifestyle,


In [35]:
apps = data_ps_fix['App'].to_frame() # turn the Series into a DataFrame
type(apps)

pandas.core.frame.DataFrame

In [36]:
bridge = pd.merge(genres, apps, left_index=True, right_index=True)\
            .melt(id_vars = ['App'], value_name = 'Genre')\
            .drop('variable', axis = 1)\
            .dropna() # drop the empty second-genre rows, so apps with a single genre are not repeated
bridge.head(5)

Unnamed: 0,App,Genre
0,Photo Editor & Candy Camera & Grid & ScrapBook,Art & Design
1,"U Launcher Lite – FREE Live Cool Themes, Hide ...",Art & Design
2,Sketch - Draw & Paint,Art & Design
3,Pixel Draw - Number Art Coloring Book,Art & Design
4,Paper flowers instructions,Art & Design


In [37]:
bridge[bridge['App'] == 'Pixel Draw - Number Art Coloring Book'] #check

Unnamed: 0,App,Genre
3,Pixel Draw - Number Art Coloring Book,Art & Design
9662,Pixel Draw - Number Art Coloring Book,Creativity
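A more compact way to get both tables is `explode`, which turns each list element into its own row. This is only a sketch on two sample rows; the names `bridge_sketch` and `genres_sketch` are hypothetical and do not replace the dataframes built above.

```python
import pandas as pd

apps = pd.DataFrame({
    'App': ['Pixel Draw - Number Art Coloring Book', 'Sketch - Draw & Paint'],
    'Genres': ['Art & Design;Creativity', 'Art & Design'],
})

# Bridging table: one row per (app, genre) pair.
bridge_sketch = (apps.assign(Genre=apps['Genres'].str.split(';'))
                     .explode('Genre')[['App', 'Genre']]
                     .reset_index(drop=True))

# Genre table: one row per distinct genre.
genres_sketch = pd.DataFrame({'Genre': sorted(bridge_sketch['Genre'].unique())})
print(bridge_sketch)
print(genres_sketch)
```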


## _9. For each genre, create a new column of the original dataframe. The new columns must have boolean values (True if the app has a given genre)_

In [38]:
def create_dummies(df, colname):
    # one 0/1 indicator column per genre, summed per app (index level 0)
    col_dummies = pd.get_dummies(df[colname].apply(pd.Series).stack()).groupby(level=0).sum()
    # every genre column is kept, since the task asks for one column per genre
    dummies = pd.concat([df, col_dummies.astype(bool)], axis=1)
    dummies.drop(colname, axis = 1, inplace = True )
    return dummies

In [39]:
dum_genres = create_dummies(data_ps_fix, 'Genres_fix')
dum_genres[dum_genres['App'] == 'Pixel Draw - Number Art Coloring Book'][['App','Genres', 'Art & Design', 'Pretend Play',
                                                                          'Creativity','Education', 'Medical',
                                                                          'Lifestyle', 'Social']] #example

Unnamed: 0,App,Genres,Art & Design,Pretend Play,Creativity,Education,Medical,Lifestyle,Social
4,Pixel Draw - Number Art Coloring Book,Art & Design;Creativity,True,False,True,False,False,False,False
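pandas can also build these indicator columns directly from the ';'-separated string with `Series.str.get_dummies`. The app labels in this sketch are made up for illustration.

```python
import pandas as pd

genres = pd.Series(['Art & Design;Creativity', 'Art & Design', 'Action'],
                   index=['app_a', 'app_b', 'app_c'])  # hypothetical app labels

# One 0/1 column per distinct genre, converted to booleans.
flags = genres.str.get_dummies(sep=';').astype(bool)
print(flags)
```

The resulting frame could then be joined to the original dataframe on its index.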


## _10. For each genre, compute the average rating. What is the genre with highest average?_

I use the one-genre-per-row encoding so that each of an app's genres receives that app's rating. To do this I merge bridge with the cleaned dataframe

In [40]:
genres_for_rating = bridge.merge(data_ps_fix, on='App')
genres_for_rating[['App', 'Genres', 'Genre']].head()

Unnamed: 0,App,Genres,Genre
0,Photo Editor & Candy Camera & Grid & ScrapBook,Art & Design,Art & Design
1,"U Launcher Lite – FREE Live Cool Themes, Hide ...",Art & Design,Art & Design
2,Sketch - Draw & Paint,Art & Design,Art & Design
3,Pixel Draw - Number Art Coloring Book,Art & Design;Creativity,Art & Design
4,Pixel Draw - Number Art Coloring Book,Art & Design;Creativity,Creativity


In [41]:
genres_for_rating.groupby('Genre')[['Rating']].mean().sort_values(by = 'Rating', ascending = False).head(5)

Genre,Rating
Events,4.435556
Puzzle,4.370732
Brain Games,4.358065
Art & Design,4.35
Books & Reference,4.343275


In [42]:
print('The genre with the highest average rating is ' + str(genres_for_rating.groupby('Genre')['Rating'].mean().idxmax()) + 
      ' with rating = ' + str(genres_for_rating.groupby('Genre')['Rating'].mean().max()))

The genre with the highest average rating is Events with rating = 4.435555555555557
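To make the effect of the merge concrete, here is a toy sketch with two hypothetical apps: an app with two genres contributes its rating to both of them.

```python
import pandas as pd

bridge_demo = pd.DataFrame({
    'App': ['a', 'a', 'b'],  # hypothetical apps
    'Genre': ['Art & Design', 'Creativity', 'Art & Design'],
})
ratings_demo = pd.DataFrame({'App': ['a', 'b'], 'Rating': [4.0, 5.0]})

# App 'a' appears once per genre after the merge, carrying the same rating.
by_genre = (bridge_demo.merge(ratings_demo, on='App')
                       .groupby('Genre')['Rating'].mean())
best = by_genre.idxmax()
print(by_genre)
print(best)
```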


## _11. For each app, compute the approximate income, obtained as the product of number of installs and price_

For this task I first need to turn the Price attribute into a numeric variable, stripping every character other than digits and the decimal point

In [43]:
data_ps_fix['ValuePrice'] = [re.sub('[^0-9.]','', price) for price in data_ps_fix['Price']]
data_ps_fix['ValuePrice'] = data_ps_fix['ValuePrice'].apply(float)
data_ps_fix[['App','ValuePrice','Price']].sort_values(by = 'ValuePrice', ascending=False).head(5)


Unnamed: 0,App,ValuePrice,Price
4367,I'm Rich - Trump Edition,400.0,$400.00
5358,I am Rich!,399.99,$399.99
5351,I am rich,399.99,$399.99
9934,I'm Rich/Eu sou Rico/أنا غني/我很有錢,399.99,$399.99
4197,most expensive app (H),399.99,$399.99


Now I create the new Income column

In [44]:
data_ps_fix['Income'] = data_ps_fix['ValuePrice'] * data_ps_fix['N_Installs']
data_ps_fix[['App','Income']].sort_values(by = 'Income', ascending=False).head(5)

Unnamed: 0,App,Income
2241,Minecraft,69900000.0
5351,I am rich,39999000.0
5356,I Am Rich Premium,19999500.0
4034,Hitman Sniper,9900000.0
7417,Grand Theft Auto: San Andreas,6990000.0
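The two steps can be sketched end to end on a few sample values. Since Installs values such as '10,000+' are lower bounds, the product is best read as a rough lower estimate rather than actual revenue.

```python
import pandas as pd

prices = pd.Series(['$6.99', '0', '$399.99'])
installs = pd.Series([10_000_000, 500_000, 100_000])

# Keep only digits and the decimal point, then parse as float.
value_price = pd.to_numeric(prices.str.replace(r'[^0-9.]', '', regex=True))
income = value_price * installs
print(income)
```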


## _12. For each app, compute its minimum and maximum Sentiment-polarity_

For this task I compute the maximum and the minimum sentiment polarity of each app separately, which gives two dataframes (max and min)

In [45]:
ur_max = data_ur.groupby('App')['Sentiment_Polarity'].max()
ur_max = pd.DataFrame(ur_max)
ur_max.head(3)

App,Sentiment_Polarity
10 Best Foods for You,1.0
104 找工作 - 找工作 找打工 找兼職 履歷健檢 履歷診療室,0.91
11st,1.0


In [46]:
# rename the columns
ur_max = ur_max.rename(columns={'Sentiment_Polarity':'max_SP'})
ur_max.head(5)

App,max_SP
10 Best Foods for You,1.0
104 找工作 - 找工作 找打工 找兼職 履歷健檢 履歷診療室,0.91
11st,1.0
1800 Contacts - Lens Store,0.838542
1LINE – One Line with One Touch,1.0


I repeat the same steps for the minimum values

In [47]:
ur_min = data_ur.groupby('App')['Sentiment_Polarity'].min()
ur_min = pd.DataFrame(ur_min)
# rename the columns
ur_min = ur_min.rename(columns={'Sentiment_Polarity':'min_SP'})
ur_min.head(5)

App,min_SP
10 Best Foods for You,-0.8
104 找工作 - 找工作 找打工 找兼職 履歷健檢 履歷診療室,-0.1125
11st,-1.0
1800 Contacts - Lens Store,-0.3
1LINE – One Line with One Touch,-0.825


In [48]:
sp = ur_max.merge(ur_min, left_index=True, right_index=True)
sp.head(5)

App,max_SP,min_SP
10 Best Foods for You,1.0,-0.8
104 找工作 - 找工作 找打工 找兼職 履歷健檢 履歷診療室,0.91,-0.1125
11st,1.0,-1.0
1800 Contacts - Lens Store,0.838542,-0.3
1LINE – One Line with One Touch,1.0,-0.825
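The same table can be produced in one step with named aggregation, which avoids building and merging two dataframes. A sketch on made-up reviews for two hypothetical apps:

```python
import numpy as np
import pandas as pd

reviews = pd.DataFrame({
    'App': ['x', 'x', 'y', 'y', 'y'],  # hypothetical apps
    'Sentiment_Polarity': [1.0, -0.8, 0.25, np.nan, -0.1],
})

# Each keyword becomes an output column; missing polarities are ignored.
sp_sketch = (reviews.groupby('App')['Sentiment_Polarity']
                    .agg(min_SP='min', max_SP='max'))
print(sp_sketch)
```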
