_______________________________________________________________________________________________________________________________

In this module we build a predictive model that classifies the type of a user. The training data comes from a survey of 300 users collected through a Google Form. The trained model is saved to a pickle file, which is then used within the Zelle website.

# Data Modeling

In [1]:
import pandas as pd
import numpy as np
import pickle

InputPath = "C:/Users/ginel/OneDrive/Desktop/Website/static/excel/"
UserData = InputPath + "UserDatabase.xlsx"
MobilePhoneData = InputPath + "MobilePhone.xlsx"
LaptopData = InputPath + "laptops.xlsx"
TabletData = InputPath + "TabletPhone.xlsx"

UserDatabase = pd.read_excel(UserData)
UserDatabase.drop("UserID",axis=1,inplace=True)

UserDatabase.head()

Unnamed: 0,Functionality,Photogenic,Battery,UserType
0,Storage,Yes,Less than an hour,Low Ended User
1,Speed,Maybe,Less than an hour,High Ended User
2,Storage,Yes,Between 3 - 4 hours,Low Ended User
3,Speed,No,More than 4 hours,High Ended User
4,Both,Yes,Less than an hour,Low Ended User


In [2]:
UserDatabase['Functionality'] = UserDatabase['Functionality'].map({'Storage':0,'Speed':1,'Both':2})
UserDatabase['Photogenic'] = UserDatabase['Photogenic'].map({'Yes':0,'No':1,'Maybe':2})
UserDatabase['Battery'] = UserDatabase['Battery'].map({'Less than an hour':0,'Between 3 - 4 hours':1,'More than 4 hours':2})
UserDatabase['UserType'] = UserDatabase['UserType'].map({'Low Ended User':0,'Medium Ended User':1,'High Ended User':2})

In [3]:
UserDatabase.head()

Unnamed: 0,Functionality,Photogenic,Battery,UserType
0,0,0,0,0
1,1,2,0,2
2,0,0,1,0
3,1,1,2,2
4,2,0,0,0


In [4]:
from sklearn.model_selection import train_test_split 
from sklearn.metrics import classification_report, confusion_matrix

X = UserDatabase.drop(['UserType'],axis=1)
y = UserDatabase['UserType']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.50, random_state=101)

In [5]:
from sklearn.ensemble import RandomForestClassifier

rmodel = RandomForestClassifier(n_estimators=100)
rmodel.fit(X_train,y_train)

rprediction = rmodel.predict(X_test)
print("Confusion Matrix")
print(confusion_matrix(y_test,rprediction))

rscore = round((rmodel.score(X_test, y_test)*100),2)
print ("\nModel Score:",rscore,"%")

Confusion Matrix
[[55  3 12]
 [ 3 29  0]
 [ 7  3 38]]

Model Score: 81.33 %
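
The model score above can be recovered directly from the confusion matrix, which also gives a per-class view. A minimal sketch, assuming scikit-learn's convention that rows are true labels and columns are predicted labels, in the order 0 (Low), 1 (Medium), 2 (High):

```python
# Confusion matrix copied from the output above
# (rows: true class, columns: predicted class)
matrix = [
    [55, 3, 12],   # true Low Ended User
    [3, 29, 0],    # true Medium Ended User
    [7, 3, 38],    # true High Ended User
]
labels = ['Low Ended User', 'Medium Ended User', 'High Ended User']

total = sum(sum(row) for row in matrix)
correct = sum(matrix[i][i] for i in range(len(matrix)))
accuracy = correct / total

# Recall: share of each true class that was predicted correctly
recall = [matrix[i][i] / sum(matrix[i]) for i in range(len(matrix))]
# Precision: share of each predicted class that was actually correct
precision = [matrix[i][i] / sum(row[i] for row in matrix) for i in range(len(matrix))]

print("Accuracy: %.2f %%" % (accuracy * 100))
for name, p, r in zip(labels, precision, recall):
    print("%-18s precision=%.2f recall=%.2f" % (name, p, r))
```

The largest single error cell is Low Ended users predicted as High Ended (12 cases), a detail the overall accuracy figure does not show.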


In [6]:
# Survey answers and their integer codes, in the same order as the mappings in In [2]
DecodedValues = ['Storage','Speed','Both','Yes','No','Maybe','Less than an hour','Between 3 - 4 hours','More than 4 hours']
CodedValues = [0,1,2,0,1,2,0,1,2]

def encode(value):
    # Find the answer's position and return the code stored at that position
    valueindex = DecodedValues.index(value)
    return CodedValues[valueindex]

UserInput = ['Both','Yes','Less than an hour']

EncodedValues = list(map(encode, UserInput))

Inputs = [EncodedValues]
Predictions = rmodel.predict(Inputs)

Users = {0:'Low Ended User',1:'Medium Ended User',2:'High Ended User'}
UserType = Users[Predictions[0]]

print("\nUser Inputs = %s\n\nUser Prediction = %s" % (UserInput, UserType))


User Inputs = ['Both', 'Yes', 'Less than an hour']

User Prediction = High Ended User
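
The flat lookup above works because no answer label is shared between questions. A minimal sketch of a per-question alternative that reuses the mappings from In [2] and rejects unknown answers; the names `FEATURE_MAPS` and `encode_answers` are illustrative and not part of the original notebook:

```python
# Per-question answer codes, identical to the mappings applied in In [2]
FEATURE_MAPS = {
    'Functionality': {'Storage': 0, 'Speed': 1, 'Both': 2},
    'Photogenic': {'Yes': 0, 'No': 1, 'Maybe': 2},
    'Battery': {'Less than an hour': 0, 'Between 3 - 4 hours': 1, 'More than 4 hours': 2},
}

def encode_answers(answers):
    """Encode one answer per question, in the column order the model was trained on."""
    if len(answers) != len(FEATURE_MAPS):
        raise ValueError("expected %d answers, got %d" % (len(FEATURE_MAPS), len(answers)))
    encoded = []
    for (feature, mapping), answer in zip(FEATURE_MAPS.items(), answers):
        if answer not in mapping:
            raise ValueError("unknown answer %r for %s" % (answer, feature))
        encoded.append(mapping[answer])
    return encoded

print(encode_answers(['Both', 'Yes', 'Less than an hour']))  # [2, 0, 0]
```

Keeping one mapping per question means a new answer option can be added to one question without checking that its label is unique across all of them.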


In [7]:
# Use a context manager so the file is closed once the model is written
with open('ZelleModel.pickle','wb') as f:
    pickle.dump(rmodel, f)

The pickled model is saved for future use. As demonstrated above, the algorithm predicts the type of the user from each set of user inputs. Similarly, we have categorized the electronics data with hand-written rules into a "Device Type" column. We will therefore use the predicted value from the algorithm to filter the device data and display appropriate results to our users. This use case is demonstrated below.
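
On the website side, the saved file has to be read back before it can predict. A minimal sketch of that round trip using only the standard library; `save_model` and `load_model` are hypothetical helper names, and a plain dictionary stands in for the trained `rmodel` so the example runs without the survey data:

```python
import os
import pickle
import tempfile

def save_model(model, path):
    """Write the model to disk, closing the file even if pickling fails."""
    with open(path, 'wb') as f:
        pickle.dump(model, f)

def load_model(path):
    """Read a pickled model back. Only load files you created yourself."""
    with open(path, 'rb') as f:
        return pickle.load(f)

# Stand-in object; in the notebook this would be the fitted rmodel
stand_in = {'name': 'ZelleModel', 'classes': [0, 1, 2]}

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'ZelleModel.pickle')
    save_model(stand_in, path)
    restored = load_model(path)

print(restored == stand_in)  # True
```

Unpickling can execute code stored in the file, so the website should only load pickles it produced itself, and a scikit-learn model is best loaded with the same library version that saved it.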

_______________________________________________________________________________________________________________________________

# Device Filtration

In [8]:
def Cleaning(Data=None):

    Phone = pd.read_excel(MobilePhoneData)
    Phone.drop_duplicates('Name',inplace=True)
    
    Tablet = pd.read_excel(TabletData)
    Tablet.drop_duplicates('Name',inplace=True)
    
    Laptop = pd.read_excel(LaptopData)
    Laptop.drop_duplicates('Name',inplace=True)
    
    # Phone Data Wrangling
    Phone['Name'].fillna('', inplace=True)
    Phone['Name'] = Phone['Name'].apply(lambda x : x.replace("Price","").strip())
    
    Phone['Brand'] = Phone['Name'].apply(lambda x : x.split(" ")[0])
    Phone['Brand'] = Phone['Brand'].apply(lambda x : 'I Kall' if x == 'I' else x)
    
    Phone['ROM'].fillna('', inplace=True)
    Phone['Expandable'] = Phone['ROM'].apply(lambda x : x.split(",")[1].strip().title() if len(x.split(",")) > 1 else '')
    Phone['ROM'] = Phone['ROM'].apply(lambda x : x.split(",")[0].strip() if x != '' else '')
    Phone['ROM'] = Phone['ROM'].apply(lambda x : x.replace("internal storage","").strip())
    
    Phone['RAM'].fillna('', inplace=True)
    Phone['RAM'] = Phone['RAM'].apply(lambda x : x.replace("RAM","").strip())
    
    Phone['Battery'].fillna('', inplace=True)
    Phone['Battery'] = Phone['Battery'].apply(lambda x : x.split("battery")[0].strip() if x != '' else '')
    Phone['Battery'] = Phone['Battery'].apply(lambda x : int(x.split("mAh")[0].strip()) if x != '' else 0)
    Phone['Expandable'] = Phone['Expandable'].apply(lambda x : x.split("Upto")[1].strip().upper() if len(x.split("Upto")) > 1 else x)
    
    Phone['Price'].fillna('0', inplace=True)
    Phone['Price'] = Phone['Price'].apply(lambda x : int(x.replace(",","").replace("Rs.","")))
    
    Phone = Phone[Phone['Price'] != 0]
    
    # Fill missing OS values with the most common OS of the same brand
    for rec in Phone['Brand'].value_counts().index:
        try:
            recmode = Phone[Phone['Brand'] == rec]['OS'].mode()[0]
        except (KeyError, IndexError):
            # mode() is empty when the brand has no known OS at all
            recmode = ''
        Phone.loc[Phone['OS'].isnull() & Phone['Brand'].eq(rec),'OS'] = recmode
    
    for i in range(len(Phone)):
        record = Phone.index[i]
        if Phone['OS'][record] == '':
            if (Phone['RAM'][record] != '') | (Phone['ROM'][record] != ''):
                Phone.loc[Phone.index == record,'OS'] = 'Android'
    
    Phone['Ratings'].fillna(0, inplace=True)
    Phone['Ratings'] = Phone['Ratings'].apply(lambda x : int(x))
    Phone['Stars'].fillna(0, inplace=True)
    Phone['Stars'] = Phone['Stars'].apply(lambda x : 0 if x == '\xa0\xa0\xa0\xa0\xa0' else x)
    Phone['Stars'] = Phone['Stars'].astype('float')
    
    Phone = Phone[Phone['OS'] != '']
    
    RAMDatabase  = ['18 GB', '12 GB', '10 GB','8 GB','6 GB']  
    ROMDatabase  = ['512 GB','256 GB','128 GB'] 

    Phone['RAMDetails'] = Phone['RAM'].apply(lambda x : 1 if x in RAMDatabase else 0)
    Phone['ROMDetails'] = Phone['ROM'].apply(lambda x : 1 if x in ROMDatabase else 0)
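    # NOTE: the phones in the sample output below have 1715 to 6000 mAh batteries,
    # so a 30000 mAh threshold leaves this flag at 0 for them; confirm the intended value.
    Phone['BatteryDetails'] = Phone['Battery'].apply(lambda x : 1 if x >= 30000 else 0)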
    Phone['Specifications'].fillna('',inplace=True)
    Phone['CameraDetails'] = Phone['Specifications'].apply(lambda x : 1 if 'camera' in x.lower() else 0)
    Phone['Score'] = Phone['RAMDetails'] + Phone['ROMDetails'] + Phone['BatteryDetails'] + Phone['CameraDetails']
    Phone['Device Type'] = Phone['Score'].apply(lambda x : 'Low Ended User' if x <= 1 else ('Medium Ended User' if x == 2 else 'High Ended User'))

    Phone.drop(['RAMDetails','ROMDetails','BatteryDetails','CameraDetails'], inplace=True, axis=1)

    Phone.dropna(inplace=True)
    Phone.reset_index(drop=True,inplace=True)
    
    # Tablet Data Wrangling
    Tablet['Name'].fillna('', inplace=True)
    Tablet['Name'] = Tablet['Name'].apply(lambda x : x.replace("Price","").strip())
    Tablet['Brand'] = Tablet['Name'].apply(lambda x : x.split(" ")[0])
    
    Tablet['ROM'].fillna('', inplace=True)
    Tablet['ROM'] = Tablet['ROM'].apply(lambda x : x.split(",")[0].strip() if x != '' else '')
    Tablet['ROM'] = Tablet['ROM'].apply(lambda x : x.replace("Storage","").strip())
    
    Tablet['RAM'].fillna('', inplace=True)
    Tablet['RAM'] = Tablet['RAM'].apply(lambda x : x.replace("RAM","").strip())
    
    Tablet['Screen'].fillna('', inplace=True)
    Tablet['Screen'] = Tablet['Screen'].apply(lambda x : x.replace("inch Screen","").strip())
    
    Tablet['Battery'].fillna('', inplace=True)
    Tablet['Battery'] = Tablet['Battery'].apply(lambda x : x.replace("mAh","").strip())
    Tablet['Battery'] = Tablet['Battery'].apply(lambda x : np.nan if x == '' else int(x))
    
    Tablet['Price'].fillna('0', inplace=True)
    Tablet['Price'] = Tablet['Price'].apply(lambda x : int(x.replace(",","").replace("Rs.","")))
    
    Tablet['OS'] = Tablet['Brand'].apply(lambda x : 'iOS' if x == 'Apple' else ('Windows' if ((x == 'Microsoft')|(x == 'Notion')) else 'Android'))
    
    Tablet = Tablet[Tablet['Price'] != 0]
    
    Tablet = Tablet[Tablet['OS'] != '']
    
    RAMDatabase  = ['8GB','6GB', '4GB']
    ROMDatabase  = ['256GB', '128GB', '64GB', '16GB']

    Tablet['RAMDetails'] = Tablet['RAM'].apply(lambda x : 1 if x in RAMDatabase else 0)
    Tablet['ROMDetails'] = Tablet['ROM'].apply(lambda x : 1 if x in ROMDatabase else 0)
    Tablet['Battery'].fillna(0,inplace=True)
    Tablet['Battery'] = Tablet['Battery'].astype('int64') 
    Tablet['BatteryDetails'] = Tablet['Battery'].apply(lambda x : 1 if x >= 5000 else 0)
    Tablet['Score'] = Tablet['RAMDetails'] + Tablet['ROMDetails'] + Tablet['BatteryDetails'] 
    Tablet['Device Type'] = Tablet['Score'].apply(lambda x : 'Low Ended User' if x == 0 else ('High Ended User' if x == 3 else 'Medium Ended User'))

    Tablet.drop(['RAMDetails','ROMDetails','BatteryDetails'], inplace=True, axis=1)

    Tablet.dropna(inplace=True)
    Tablet.reset_index(drop=True,inplace=True)
    
    # Laptop Data Wrangling
    Laptop['Name'].fillna('', inplace=True)
    Laptop['Name'] = Laptop['Name'].apply(lambda x : x.replace("Price","").strip())
    
    Laptop['Name'] = Laptop['Name'].apply(lambda x : x.split("(")[0].strip() if x != '' else '')
    
    Laptop['Brand'] = Laptop['Name'].apply(lambda x : x.split(" ")[0].title())
    
    Laptop['Price'].fillna('0', inplace=True)
    Laptop['Price'] = Laptop['Price'].apply(lambda x : int(x.replace(",","").replace("Rs.","")))
    
    Laptop = Laptop[Laptop['Price'] != 0]
    
    Laptop['RAM'].fillna('', inplace=True)
    Laptop['RAM'] = Laptop['RAM'].apply(lambda x : x.replace("RAM","").strip())
    
    Laptop['ROM'].fillna('', inplace=True)
    Laptop['ROM'] = Laptop['ROM'].apply(lambda x : x.replace("SSD","").replace("HDD","").strip())
    
    Laptop['Battery'].fillna("0", inplace=True)
    Laptop['Battery'] = Laptop['Battery'].apply(lambda x : float(x.replace("Hrs","").strip()))
    Laptop['Battery'] = Laptop['Battery'].apply(lambda x : np.nan if x == 0 else x)
    Laptop['Battery'].fillna(round(Laptop['Battery'].mean()), inplace=True)
    
    # Fill missing OS values with the most common OS of the same brand
    for rec in Laptop['Brand'].value_counts().index:
        try:
            recmode = Laptop[Laptop['Brand'] == rec]['OS'].mode()[0]
        except (KeyError, IndexError):
            # mode() is empty when the brand has no known OS at all
            recmode = 'Windows 10'
        Laptop.loc[Laptop['OS'].isnull() & Laptop['Brand'].eq(rec),'OS'] = recmode
        
    # Fill missing processors the same way, with a fixed fallback
    for rec in Laptop['Brand'].value_counts().index:
        try:
            recmode = Laptop[Laptop['Brand'] == rec]['Processor'].mode()[0]
        except (KeyError, IndexError):
            recmode = 'Intel Core i3 (10th Gen)'
        Laptop.loc[Laptop['Processor'].isnull() & Laptop['Brand'].eq(rec),'Processor'] = recmode
        
    for i in range(len(Laptop[['OS','Processor']].value_counts().index)):
        rec = Laptop[['OS','Processor']].value_counts().index[i]
        recmode = Laptop[(Laptop['OS'] == rec[0]) | (Laptop['Processor'] == rec[1])]['Webcam'].mode()[0]
        Laptop.loc[Laptop['Webcam'].isnull() & Laptop['OS'].eq(rec[0]) & Laptop['Processor'].eq(rec[1]),'Webcam'] = recmode 
        
    Laptop['RAM'].fillna(Laptop['RAM'].mode()[0], inplace=True)
    Laptop['ROM'].fillna(Laptop['ROM'].mode()[0], inplace=True)
    Laptop['Ratings'].fillna(0, inplace=True)
    Laptop['Ratings'] = Laptop['Ratings'].apply(lambda x : int(x))
    
    Laptop['Stars'].fillna(0, inplace=True)
    Laptop['Stars'] = Laptop['Stars'].apply(lambda x : 0 if x == '\xa0\xa0\xa0\xa0\xa0' else x)
    Laptop['Stars'] = Laptop['Stars'].astype('float')
        
    Laptop = Laptop[Laptop['OS'] != '']
    
    RAMDatabase  = ['32GB', '16GB','8GB']
    ROMDatabase  = ['2 TB','1 TB','750 GB','512 GB','500 GB']

    Laptop['RAMDetails'] = Laptop['RAM'].apply(lambda x : 1 if x in RAMDatabase else 0)
    Laptop['ROMDetails'] = Laptop['ROM'].apply(lambda x : 1 if x in ROMDatabase else 0)
    Laptop['Battery'].fillna(0,inplace=True)
    Laptop['Battery'] = Laptop['Battery'].astype('int64') 
    # Compute the mean once instead of once per row
    MeanBattery = Laptop['Battery'].mean()
    Laptop['BatteryDetails'] = Laptop['Battery'].apply(lambda x : 1 if x >= MeanBattery else 0)
    Laptop['Specifications'].fillna('',inplace=True)
    Laptop['CameraDetails'] = Laptop['Specifications'].apply(lambda x : 1 if 'hd' in x.lower() else 0)

    Laptop['Score'] = Laptop['RAMDetails'] + Laptop['ROMDetails'] + Laptop['BatteryDetails'] + Laptop['CameraDetails']
    Laptop['Device Type'] = Laptop['Score'].apply(lambda x : 'Low Ended User' if x <= 1 else ('Medium Ended User' if x == 2 else 'High Ended User'))

    Laptop.drop(['RAMDetails','ROMDetails','BatteryDetails','CameraDetails'], inplace=True, axis=1)

    Laptop.dropna(inplace=True)
    Laptop.reset_index(drop=True,inplace=True)
    
    return Phone, Tablet, Laptop

In [9]:
PhoneDB, TabletDB, LaptopDB = Cleaning()
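
The device-type rule that `Cleaning` applies to phones can be isolated as two small functions, which makes it easier to check by hand. A minimal sketch: the RAM and ROM lists, the battery threshold, and the score cut-offs are copied from the phone branch above, while the function names and the specification strings in the examples are made up for illustration:

```python
# Values copied from the phone branch of Cleaning
PHONE_RAM = {'18 GB', '12 GB', '10 GB', '8 GB', '6 GB'}
PHONE_ROM = {'512 GB', '256 GB', '128 GB'}
PHONE_BATTERY_MIN = 30000  # threshold exactly as written in Cleaning

def phone_score(ram, rom, battery_mah, specifications):
    """Add one point for each criterion the phone meets (0 to 4)."""
    return (int(ram in PHONE_RAM)
            + int(rom in PHONE_ROM)
            + int(battery_mah >= PHONE_BATTERY_MIN)
            + int('camera' in specifications.lower()))

def device_type(score):
    """Map a phone or laptop score to the label used in the 'Device Type' column."""
    if score <= 1:
        return 'Low Ended User'
    if score == 2:
        return 'Medium Ended User'
    return 'High Ended User'

# Hypothetical specification strings, for illustration only
print(device_type(phone_score('4 GB', '64 GB', 6000, '48 MP rear camera')))  # Low Ended User
print(device_type(phone_score('8 GB', '128 GB', 5000, 'Triple camera')))     # High Ended User
```

Tablets use a different mapping in `Cleaning` (score 0 is low, 3 is high, anything else medium), so `device_type` as sketched here applies only to the phone and laptop scores.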

The data has been cleaned, processed, and categorized into device types. We now assume several scenarios in which the model has predicted the type of the user and the user has supplied a budget. The following filters are then applied to display the most suitable results.

In [10]:
LaptopDB[(LaptopDB['Device Type'] == 'Low Ended User') & (LaptopDB['Price'] <= 15000)].head(6).reset_index(drop=True)

Unnamed: 0,Name,Ratings,Stars,OS,RAM,ROM,Battery,Processor,Webcam,Specifications,Price,Image,Link,Brand,Score,Device Type
0,iBall Compbook OHD Laptop,17,3.5,Windows 10,,,10,Intel Atom Quad-Core,"Yes, HD Webcam",,10699,https://cdn.pricebaba.com/prod/images/product/...,https://pricebaba.com/laptop/iball-compbook-ohd,Iball,1,Low Ended User
1,I-Life Zed Air,0,0.0,Windows 10,2 GB,32 GB,8,Intel Atom Quad-Core,"Yes, HD Webcam","14 Inches, 1366 x 768 Screen Resolution, Windo...",8990,https://cdn.pricebaba.com/prod/images/product/...,https://pricebaba.com/laptop/i-life-zed-air-in...,I-Life,1,Low Ended User
2,Lava Helium 14,569,3.4,Windows 10,2 GB,32 GB,9,Intel Atom Quad-Core,"Yes, HD Webcam","14.1 Inches, 1920 x 1080 Screen Resolution, Wi...",14999,https://cdn.pricebaba.com/prod/images/product/...,https://pricebaba.com/laptop/lava-helium-14-in...,Lava,1,Low Ended User
3,RDP ThinBook 1450-EC1,119,3.4,Windows 10,2 GB,32 GB,8,Intel Atom Quad-Core,"Yes, HD Webcam","14.1 Inches, 1366 x 768 Screen Resolution, Win...",12490,https://cdn.pricebaba.com/prod/images/product/...,https://pricebaba.com/laptop/rdp-thinbook-1450...,Rdp,1,Low Ended User
4,Reach Cosmos RCN-021w,268,3.3,Windows 10,,,10,Intel Core i5 (4th Gen),"Yes, HD Webcam",,8499,https://cdn.pricebaba.com/prod/images/product/...,https://pricebaba.com/laptop/reach-cosmos-rcn-...,Reach,1,Low Ended User
5,Reach MI1041R,0,0.0,Windows 10,,,10,Intel Core i5 (4th Gen),"Yes, HD Webcam",,7599,https://cdn.pricebaba.com/prod/images/product/...,https://pricebaba.com/laptop/reach-mi1041r-aqc...,Reach,1,Low Ended User


In [11]:
PhoneDB[(PhoneDB['Device Type'] == 'Low Ended User') & (PhoneDB['Price'] > 15000) & (PhoneDB['Price'] <= 30000)].head(6).reset_index(drop=True)

Unnamed: 0,Name,Ratings,Stars,OS,RAM,ROM,Battery,Specifications,Price,Image,Link,Brand,Expandable,Score,Device Type
0,Samsung Galaxy M30s,214132,4.3,Android v9.0 (Pie) Upgradable to v10 (Q),4 GB,64 GB,6000,"Android v9.0 (Pie) Upgradable to v10 (Q), 6.4 ...",20500,https://cdn.pricebaba.com/prod/images/product/...,https://pricebaba.com/mobile/samsung-galaxy-m30s,Samsung,512 GB,1,Low Ended User
1,Samsung Galaxy M30,4181,4.4,Android v8.1 (Oreo) Upgradable to v10 (Q),4 GB,64 GB,5000,"Android v8.1 (Oreo) Upgradable to v10 (Q), 6.4...",16480,https://cdn.pricebaba.com/prod/images/product/...,https://pricebaba.com/mobile/samsung-galaxy-m30,Samsung,512 GB,1,Low Ended User
2,Samsung Galaxy A50,67695,4.3,Android v9.0 (Pie) Upgradable to v10 (Q),4 GB,64 GB,4000,"Android v9.0 (Pie) Upgradable to v10 (Q), 6.4 ...",15899,https://cdn.pricebaba.com/prod/images/product/...,https://pricebaba.com/mobile/samsung-galaxy-a50,Samsung,512 GB,1,Low Ended User
3,Vivo Z1 Pro,355595,4.5,Android v9.0 (Pie),4 GB,64 GB,5000,"Android v9.0 (Pie), 6.53 inches (16.59 cm) bez...",15790,https://cdn.pricebaba.com/prod/images/product/...,https://pricebaba.com/mobile/vivo-z1-pro,Vivo,256 GB,1,Low Ended User
4,Apple iPhone 7,94764,4.5,iOS v10 Upgradable to v11.2,2 GB,32 GB,1960,"iOS v10 Upgradable to v11.2, 4.7 inches (11.94...",24999,https://cdn.pricebaba.com/prod/images/product/...,https://pricebaba.com/mobile/apple-iphone-7,Apple,Non-Expandable Memory,1,Low Ended User
5,Apple iPhone 6s 64GB,93094,4.5,iOS v9 Upgradable to v11.2,2 GB,64 GB,1715,"iOS v9 Upgradable to v11.2, 4.7 inches (11.94 ...",23999,https://cdn.pricebaba.com/prod/images/product/...,https://pricebaba.com/mobile/apple-iphone-6s-64gb,Apple,Non-Expandable Memory,1,Low Ended User


In [12]:
TabletDB[(TabletDB['Device Type'] == 'Low Ended User') & (TabletDB['Price'] > 30000)].head(6).reset_index(drop=True)

Unnamed: 0,Name,RAM,ROM,Screen,Battery,Specification,Price,Image,Link,Brand,OS,Score,Device Type
0,Apple iPad 10.2 2020 WiFi + Cellular 32GB,3GB,32GB,10.2,0,"10.2 inch Screen, 3GB RAM, 32GB Storage",39890,https://cdn.pricebaba.com/prod/images/product/...,https://pricebaba.com/tablet/apple-ipad-10-2-2...,Apple,iOS,0,Low Ended User
1,Apple iPad Pro 12.9 2021 WiFi 2TB,16GB,,12.9,0,"12.9 inch Screen, 16GB RAM",198900,https://cdn.pricebaba.com/prod/images/product/...,https://pricebaba.com/tablet/apple-ipad-pro-12...,Apple,iOS,0,Low Ended User
2,Asus PadFone,1GB,32GB,4.3,1520,"4.3 inch Screen, 1GB RAM, 32GB Storage, 1520mAh",64999,https://cdn.pricebaba.com/prod/images/product/...,https://pricebaba.com/tablet/asus-padfone,Asus,Android,0,Low Ended User


In a similar way, this module will be attached to the backend of the website.

_______________________________________________________________________________________________________________________________