# Usage Instructions - Mphasis HyperGraf Product Recommender

Product Recommender is a HyperGraf solution that finds similarities between users and items simultaneously to provide recommendations. It analyzes the collected information on users' buying patterns and recommends items to each user based on similarity with other users. Mphasis HyperGraf is an omni-channel customer 360 analytics solution that supports enterprise decision making with comprehensive, accurate, real-time, and actionable customer engagement insights.

### Prerequisite

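The kernel comes pre-installed with the required packages. Otherwise, make sure your environment has at least the following Python packages:

    - numpy
    - pandas
    - surprise (published on PyPI as scikit-surprise)
    - collections (part of the Python standard library, no installation needed)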
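
To confirm that these modules can be imported before running the notebook, a minimal sketch like the following may help. The helper name `find_missing_modules` is hypothetical and not part of the product; it only checks that each module can be located, not which version is installed.

```python
import importlib.util

# Import names taken from the prerequisite list above.
REQUIRED_MODULES = ["numpy", "pandas", "surprise", "collections"]


def find_missing_modules(module_names):
    """Return the names that cannot be located in the current environment."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


missing = find_missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are available.")
```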

### Contents

1. [Input Data](#Input-Data)
1. [Create Model](#Create-Model)
1. [Batch Transform Job](#Batch-Transform-Job)
1. [Output Data](#Output-Data)

## Input Data

1)	The input dataset should be in CSV format.

2)	The column names in the input file should be:

    a.	ORDERID: Invoice number.
    
    b.	SKUNUMBER: Stock keeping unit ID.
    
    c.	SKU DESCRIPTION: Item description, a string containing the item name along with its brand name and color name.
    
    d.	CUSTOMERID: Identifier specific to each customer.
    
3)	The invoice number is a systematically assigned sequential code that is unique to each invoice.

4)	More than one item may share the same stock keeping unit ID, but no item can have more than one stock keeping unit ID.

5)	No item mentioned in the description can have more than one stock keeping unit (stock code).
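
The requirements above can be checked before submitting a file. The following is a minimal sketch, not part of the product: `validate_input` and `REQUIRED_COLUMNS` are hypothetical names, the sample rows are shaped like the sample data shown in this notebook, and only the column names and the "one stock keeping unit per description" rule are checked.

```python
import pandas as pd

REQUIRED_COLUMNS = ["ORDERID", "SKUNUMBER", "SKU DESCRIPTION", "CUSTOMERID"]


def validate_input(df):
    """Return a list of human-readable problems found in the input frame."""
    problems = []
    missing = [col for col in REQUIRED_COLUMNS if col not in df.columns]
    if missing:
        problems.append("missing columns: " + ", ".join(missing))
        return problems
    # Each item description should map to exactly one SKU number.
    sku_counts = df.groupby("SKU DESCRIPTION")["SKUNUMBER"].nunique()
    ambiguous = [str(desc) for desc in sku_counts[sku_counts > 1].index]
    if ambiguous:
        problems.append("descriptions with more than one SKU: " + ", ".join(ambiguous))
    return problems


# Illustrative rows only; load the real file with pd.read_csv instead.
sample = pd.DataFrame({
    "ORDERID": [524201, 524201, 524203],
    "SKUNUMBER": [100100, 100101, 100103],
    "SKU DESCRIPTION": ["SET OF 6 CFLs", "TONED MILK", "PENCILS"],
    "CUSTOMERID": [500100, 500100, 500108],
})
print(validate_input(sample))
```

An empty list means none of the checked rules were violated.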

In [1]:
import pandas as pd

In [2]:
input_df  = pd.read_csv("sampleData.csv")
input_df.head()

Unnamed: 0,ORDERID,SKUNUMBER,SKU DESCRIPTION,CUSTOMERID
0,524201,100100,SET OF 6 CFLs,500100
1,524201,100101,TONED MILK,500100
2,524201,100102,BUTTER,500100
3,524203,100103,PENCILS,500108
4,524202,100104,WRIST WATCH,500104


## Create Model

In [3]:
import boto3
import re

In [4]:
model_package_arn = 'arn:aws:sagemaker:us-east-2:786796469737:model-package/marketplace-collaboration--10-17'

In [5]:
from sagemaker import ModelPackage
import sagemaker as sage
from sagemaker import get_execution_role

role = get_execution_role()
sagemaker_session = sage.Session()

In [6]:
# Create a deployable model from the Marketplace model package
model = ModelPackage(model_package_arn=model_package_arn,
                     role=role,
                     sagemaker_session=sagemaker_session)

## Batch Transform Job

In [7]:
# Run the batch transform job on a single ml.m5.large instance
transformer = model.transformer(1, 'ml.m5.large')
transformer.transform('s3://mphasis-marketplace/collaborative/sampleData.csv', content_type='text/csv')
transformer.wait()
print("Batch Transform complete")
# output_path is expected to look like 's3://<bucket>/<folder>';
# element 3 of the '/'-split is the output folder
bucketFolder = transformer.output_path.split('/')[3]

................ * Serving Flask app "serve" (lazy loading)
 * Environment: production
   Use a production WSGI server instead.
 * Debug mode: on
 * Running on http://0.0.0.0:8080/ (Press CTRL+C to quit)
 * Restarting with stat
 * Debugger is active!
 * Debugger PIN: 252-952-850
169.254.255.130 - - [30/Jan/2020 04:55:31] "GET /ping HTTP/1.1" 200 -
169.254.255.130 - - [30/Jan/2020 04:55:31] "GET /execution-parameters HTTP/1.1" 404 -
Computing the msd similarity matrix...
Done computing similarity matrix.
169.254.255.130 - - [30/Jan/2020 04:55:34] "POST /invocations HTTP/1.1" 200 -

2020-01-30T04:55:31.075:[sagemaker logs]: MaxConcurrentTransforms=1, MaxPayloadInMB=6, BatchStrategy=MULTI_RECORD
Batch Transform complete


In [8]:
# Download the batch transform output from S3 into a local file
s3_conn = boto3.client("s3")
bucket_name = "sagemaker-us-east-2-786796469737"
with open('FILE_NAME', 'wb') as f:
    s3_conn.download_fileobj(bucket_name, bucketFolder + '/sampleData.csv.out', f)
print("Output file loaded from bucket")

Output file loaded from bucket
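
The cells above hard-code the bucket name and pick the output folder by position in the split path. As an alternative, both values could be derived from the transformer's output path. This is a minimal sketch using only the standard library: `split_s3_uri` is a hypothetical helper, and the URI below is a made-up example shaped like `transformer.output_path`, not a real location.

```python
from urllib.parse import urlparse


def split_s3_uri(s3_uri):
    """Split 's3://bucket/prefix' into a (bucket, prefix) pair."""
    parsed = urlparse(s3_uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError("not an S3 URI: " + s3_uri)
    return parsed.netloc, parsed.path.lstrip("/")


# Hypothetical output path; in the notebook this would be transformer.output_path.
bucket, prefix = split_s3_uri("s3://example-bucket/example-transform-job")
# As in the cell above, the output object is the input file name plus '.out'.
output_key = prefix + "/sampleData.csv.out"
print(bucket, output_key)
```

The resulting `bucket` and `output_key` could then be passed to `download_fileobj` in place of the hard-coded values.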


## Output Data

The output file (in CSV format) contains the following columns:

a.	CUSTOMERID: ID of the customer for whom the recommendations in the adjacent column were generated.

b.	Recommendation: Top 10 product recommendations for that customer.

Each row of the generated output holds the top 10 product recommendations for one user.

In [9]:
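output_df = pd.read_csv("FILE_NAME")
# Drop the index column that was written into the CSV
# (pandas 2.0 and later no longer accept the axis as a positional argument)
output_df = output_df.drop(columns='Unnamed: 0')
out_final = output_df
print("Output: ")
out_final.head()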

Output: 


Unnamed: 0,CUSTOMERID,Recommendation
0,12426,"BIRDS MOBILE VINTAGE DESIGN, COFFEE MUG CAT + ..."
1,12427,"ROUND CONTAINER SET OF 5 RETROSPOT, ROUND SNAC..."
2,12468,"GLASS JAR ENGLISH CONFECTIONERY, ILLUSTRATED C..."
3,12471,"ILLUSTRATED CAT BOWL , ROUND CONTAINER SET OF ..."
4,12472,"GLASS JAR MARMALADE , GLASS JAR ENGLISH CONFEC..."
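
For downstream use it can be convenient to turn the `Recommendation` string into one row per recommended item. The sketch below works under two stated assumptions: items are separated by commas, and no item description contains a comma itself. The rows are hypothetical stand-ins for the real output, and the `position` column only records the order in which items appear in the string, which may or may not reflect ranking.

```python
import pandas as pd

# Hypothetical rows shaped like the output file; real rows hold 10 items each.
output_df = pd.DataFrame({
    "CUSTOMERID": [1, 2],
    "Recommendation": ["ITEM A, ITEM B , ITEM C", "ITEM B, ITEM D"],
})

# Assumption: items are comma-separated and contain no commas themselves.
output_df["item"] = output_df["Recommendation"].apply(
    lambda text: [part.strip() for part in text.split(",")]
)

# One row per (customer, item), numbered by position within the original string.
long_df = output_df.explode("item")
long_df["position"] = long_df.groupby("CUSTOMERID").cumcount() + 1
long_df = long_df[["CUSTOMERID", "position", "item"]].reset_index(drop=True)
print(long_df)
```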
