In [46]:
#required libraries
import numpy as np
import pandas as pd

from scipy.sparse import csr_matrix

## **Load the Dataset**
I’ve used the famous Online Retail dataset from the UCI Machine Learning Repository in this notebook.

In [47]:
#import the dataset
df = pd.read_csv("../input/recommendation-system/Online Retail.csv",
                 encoding="ISO-8859-1",
                 dtype={'CustomerID': str})
df.head()

## **Exploratory Data Analysis**

Looking at the table above, each row is a transaction with a few columns that are of interest:

* **InvoiceNo** (for the order)
* **StockCode** (the product)
* **Description** (of the product)
* **Quantity** (of products bought)
* **CustomerID** (of the customer)

These columns provide the values that describe the relationship between the customer and their purchases.

In [48]:
#dropping the irrelevant columns
df.drop(['InvoiceDate', 'UnitPrice', 'Country'], axis=1, inplace=True)

In [49]:
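#the file starts with a UTF-8 byte order mark, which ISO-8859-1 decodes as "ï»¿";
#rename the first column to remove it
df.rename(columns={"ï»¿InvoiceNo": "InvoiceNo"}, inplace=True)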
df.head()
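
The odd prefix in the first column name is most likely a UTF-8 byte order mark (BOM) that was decoded as ISO-8859-1. Here is a minimal sketch of how that can happen, using a made-up two-line CSV rather than the real file (`raw_bytes` and `toy` are illustrative names, not part of the notebook):

```python
import io
import pandas as pd

# Hypothetical tiny CSV, saved as UTF-8 with a byte order mark (BOM).
raw_bytes = "InvoiceNo,Quantity\n536365,6\n".encode("utf-8-sig")

# Decoding those bytes as ISO-8859-1 turns the three BOM bytes into "ï»¿".
toy = pd.read_csv(io.BytesIO(raw_bytes), encoding="ISO-8859-1")
garbled_name = toy.columns[0]

# Strip the debris from every column name instead of hard-coding one rename.
toy.columns = toy.columns.str.replace("ï»¿", "", regex=False)
clean_name = toy.columns[0]

print(repr(garbled_name), "->", repr(clean_name))
```

Stripping the prefix from all column names is only one option; the explicit rename above does the same job for this dataset.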

In [50]:
df.info()

## **Transforming the data into a matrix**

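We will recommend items based on item-item similarity. For that, we first pivot the transactions into a matrix with one row per invoice (`InvoiceNo`) and one column per product (`Description`), so its shape is the number of unique invoices by the number of unique items. Each cell holds the purchased quantity, or 0 where a product does not appear on an invoice. The columns of this matrix can then be compared to calculate the similarity between items.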
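
To make the pivot step concrete, here is a small sketch on a made-up transaction table (the `toy` frame and its values are hypothetical; only the column names come from the dataset). It also shows one detail worth knowing: `pivot_table` aggregates with the mean by default, so an item listed twice on the same invoice is averaged unless `aggfunc="sum"` is passed.

```python
import pandas as pd

# Hypothetical toy transactions using the notebook's column names.
toy = pd.DataFrame({
    "InvoiceNo":   ["A1", "A1", "A1", "A2", "A3"],
    "Description": ["MUG", "MUG", "LAMP", "MUG", "LAMP"],
    "Quantity":    [2, 4, 1, 3, 5],
})

n_invoices = toy["InvoiceNo"].nunique()   # number of rows in the matrix
n_items = toy["Description"].nunique()    # number of columns in the matrix

# Default aggregation is the mean: invoice A1 lists MUG twice (2 and 4).
mean_matrix = toy.pivot_table(index="InvoiceNo", columns="Description",
                              values="Quantity").fillna(0)

# aggfunc="sum" keeps the total quantity per invoice and item instead.
sum_matrix = toy.pivot_table(index="InvoiceNo", columns="Description",
                             values="Quantity", aggfunc="sum").fillna(0)

print(n_invoices, n_items)
print(mean_matrix)
print(sum_matrix)
```

Whether the mean or the sum is the better choice depends on how duplicate lines on an invoice should be interpreted; the notebook itself uses the default.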

In [51]:
item_matrix = df.pivot_table(index='InvoiceNo', columns=['Description'], values='Quantity').fillna(0)
item_matrix.head()

We see that the item matrix is very sparse: most of its cells are 0. I will use `csr_matrix()` to reduce memory usage.
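
As a rough illustration of the saving, the sketch below compares the memory footprint of a dense array with its CSR counterpart. The matrix is randomly generated with an assumed size and density, not taken from the dataset, so the numbers only indicate the general effect.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical matrix: 1,000 invoices x 500 items, at most 2,000 non-zero cells.
rng = np.random.default_rng(0)
dense = np.zeros((1000, 500))
rows = rng.integers(0, 1000, size=2000)
cols = rng.integers(0, 500, size=2000)
dense[rows, cols] = 1.0

sparse = csr_matrix(dense)

dense_bytes = dense.nbytes
# CSR stores only the non-zero values plus two index arrays.
sparse_bytes = sparse.data.nbytes + sparse.indices.nbytes + sparse.indptr.nbytes

print(f"dense:  {dense_bytes:,} bytes")
print(f"sparse: {sparse_bytes:,} bytes")
```

The sparser the matrix, the larger the gap; for a matrix that is mostly non-zero, CSR can end up using more memory than the dense array.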

In [52]:
csr_sample = csr_matrix(item_matrix)
print(csr_sample)

The negative quantities are the purchases that have been returned.
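
This notebook keeps the returns in the matrix. If you would rather exclude them, one possible approach is sketched below on made-up rows (the `toy` frame is hypothetical). It relies on the dataset's documented convention that cancelled invoices have an `InvoiceNo` starting with "C".

```python
import pandas as pd

# Hypothetical rows; the third one is a return on a cancelled invoice.
toy = pd.DataFrame({
    "InvoiceNo":   ["536365", "536365", "C536379", "536380"],
    "Description": ["MUG", "LAMP", "MUG", "LAMP"],
    "Quantity":    [6, 2, -1, 4],
})

# Option 1: keep only rows with a positive quantity.
purchases = toy[toy["Quantity"] > 0]

# Option 2: flag cancelled invoices by their leading "C".
is_cancellation = toy["InvoiceNo"].astype(str).str.startswith("C")

print(purchases)
print(is_cancellation.tolist())
```

Dropping returns changes the correlations computed later, so it is a modelling decision rather than a pure clean-up step.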

In [53]:
def get_recommendations(df, item):
    """Generate a set of product recommendations using item-based collaborative filtering.
    
    Args:
        df (dataframe): Pandas dataframe containing matrix of items purchased.
        item (string): Column name for target item. 
        
    Returns: 
        recommendations (dataframe): Pandas dataframe containing product recommendations. 
    """
    
    recommendations = df.corrwith(df[item])
    recommendations.dropna(inplace=True)
    recommendations = pd.DataFrame(recommendations, columns=['correlation']).reset_index()
    recommendations = recommendations.sort_values(by='correlation', ascending=False)
    
    return recommendations
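
To see what `corrwith` does inside the function, here is a small sketch on a made-up invoice-by-item matrix (`toy_matrix` and its values are hypothetical). It computes the Pearson correlation of each column with the target column and, as an optional extra step that the function above does not take, drops the target itself before ranking.

```python
import pandas as pd

# Hypothetical invoice x item matrix: rows are invoices, values are quantities.
toy_matrix = pd.DataFrame(
    {
        "MUG":    [1, 2, 0, 3, 0],
        "SAUCER": [1, 2, 0, 3, 1],
        "LAMP":   [3, 0, 2, 0, 1],
    },
    index=["A1", "A2", "A3", "A4", "A5"],
)

# Pearson correlation of every column with the target column.
corr = toy_matrix.corrwith(toy_matrix["MUG"])

# The target always correlates perfectly with itself, so drop it before ranking.
ranked = corr.drop("MUG").sort_values(ascending=False)

print(ranked)
```

In this toy example SAUCER is bought in step with MUG and ranks first, while LAMP tends to be bought when MUG is not and gets a negative correlation.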

In [54]:
recommendations = get_recommendations(item_matrix, 'WHITE HANGING HEART T-LIGHT HOLDER')
recommendations.head()

In [55]:
recommendations = get_recommendations(item_matrix, 'PARTY BUNTING')
recommendations.head()