<a href="https://colab.research.google.com/github/Kanoru01/eagles-final-project/blob/master/Amazon_Fashion_Recommender_System.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# CALL VOLUME PREDICTION

## Business Understanding

### Problem statement
Reported cases of child abuse have increased, creating an urgent need for prompt responses. Acting on each report quickly is crucial to handling cases in a timely and efficient way.

### Business Question
The stakeholders are Mtoto News, a digital company that uses technology to improve child wellbeing, and Childline Kenya, an NGO established in response to the state of child protection in Kenya and the way abuse cases were being reported and handled.

### Main objective
Building a forecasting model that predicts the number of incoming calls that Childline Kenya will receive per hour per day.

### Supplementary Objective
Ensuring Childline Kenya is adequately staffed to handle reported cases. 


## Data Understanding

The dataset is provided on [Zindi Africa](https://zindi.africa/competitions/mtoto-news-childline-kenya-call-volume-prediction-challenge) as part of a competition sponsored by Mtoto News and Childline Kenya. The data has been split into training and test sets. The training set contains all the calls (over 135,000) received from 1 January 2016 to 12 July 2016. The task is to estimate the number of incoming calls per hour per day from 13 July 2016 to 6 September 2016.

Each call contains the following fields:

- calldate - Date (month-day-year) and time of the call

- cc_status - Case status

- maincat - Main category call falls into

- subcat1 - Subcategory call falls under

- casepriority - Priority of the case

- referal - Place the case was referred to

- caller_gender - Gender of the caller

- caller_age - Age of the caller

- caller_county - Area where the call came from

- child_age - Age of the child in case

- child_gender - Gender of the child in case

- child_county - Area where the child is from

- parent_age - Age of the parent

- parent_gender - Gender of the parent

- parent_county - Area where the parent is from

- Abuser_Relationship - Relationship abuser has with the child in case

- Neglector_Relationship - Relationship neglector has with the child in case

- Physical_abuser_Relationship - Relationship physical abuser has with the child in case
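
The forecasting target, calls per hour per day, is not a field in the list above; it has to be derived from `calldate`. The following is a minimal sketch of that aggregation, not the notebook's actual preprocessing. The sample rows and the exact timestamp format string are assumptions made for illustration; only the column name `calldate` and the month-day-year ordering come from the field description.

```python
import pandas as pd

# Hypothetical sample rows; the real training file is not loaded here.
calls = pd.DataFrame({
    "calldate": [
        "01-01-2016 08:15:00",
        "01-01-2016 08:45:00",
        "01-01-2016 10:05:00",
    ],
})

# Parse the timestamps (assumed format: month-day-year plus a 24-hour time).
calls["calldate"] = pd.to_datetime(calls["calldate"], format="%m-%d-%Y %H:%M:%S")

# One entry per call, indexed by time, then summed within each hourly bin.
# Hours with no calls inside the observed range get a count of 0.
one_per_call = pd.Series(1, index=calls["calldate"])
hourly_calls = one_per_call.resample("h").sum().rename("call_count")

print(hourly_calls)
```

Keeping the empty hours as explicit zeros matters for a staffing forecast, since quiet hours are part of the pattern the model needs to learn.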

## Loading the Data

### Loading libraries

Connecting to Google Drive and importing the necessary libraries and modules.

In [None]:
# Connect Colab to Google Drive

from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
#import all the relevant libraries

import pandas as pd
import numpy as np
import os, shutil

import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns




### Loading the data

The data was loaded with `pd.read_json`, passing `lines=True` because the file is line-delimited JSON (`.ldjson`), with one record per line.

In [None]:
# Read the line-delimited JSON file (one JSON record per line)
data = pd.read_json('/content/drive/Shareddrives/Eagles/marketing_sample_for_amazon_com-amazon_fashion_products__20200201_20200430__30k_data.ldjson', lines=True)
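
To illustrate what `lines=True` does, here is a small self-contained sketch that reads line-delimited JSON from an in-memory buffer instead of Google Drive. The two records and the variable names `sample` and `demo` are made up for the example; only the column names `asin` and `rating` are borrowed from the dataset.

```python
import io
import pandas as pd

# Two hypothetical records, one JSON object per line (the .ldjson layout).
sample = io.StringIO(
    '{"asin": "A1", "rating": 5.0}\n'
    '{"asin": "A2", "rating": 4.0}\n'
)

# lines=True tells pandas to parse each line as a separate record.
demo = pd.read_json(sample, lines=True)

print(demo)
```

Without `lines=True`, pandas would try to parse the whole file as a single JSON document and fail on the second record.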

In [None]:
data.head()

|  | uniq_id | crawl_timestamp | asin | product_url | product_name | image_urls__small | medium | large | browsenode | brand | ... | colour | no__of_reviews | seller_name | seller_id | left_in_stock | no__of_offers | no__of_sellers | technical_details__k_v_pairs | formats___editions | name_of_author_for_books |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 26d41bdc1495de290bc8e6062d927729 | 2020-02-07 05:11:36 +0000 | B07STS2W9T | https://www.amazon.in/Facon-Kalamkari-Handbloc... | LA' Facon Cotton Kalamkari Handblock Saree Blo... | https://images-na.ssl-images-amazon.com/images... | https://images-na.ssl-images-amazon.com/images... | https://images-na.ssl-images-amazon.com/images... | 1968255000.0 | LA' Facon | ... |  |  |  |  |  |  |  |  |  |  |
| 1 | 410c62298852e68f34c35560f2311e5a | 2020-02-07 08:45:56 +0000 | B07N6TD2WL | https://www.amazon.in/Sf-Jeans-Pantaloons-T-Sh... | Sf Jeans By Pantaloons Men's Plain Slim fit T-... | https://images-na.ssl-images-amazon.com/images... | https://images-na.ssl-images-amazon.com/images... | https://images-na.ssl-images-amazon.com/images... | 1968123000.0 |  | ... |  |  |  |  |  |  |  |  |  |  |
| 2 | 52e31bb31680b0ec73de0d781a23cc0a | 2020-02-06 11:09:38 +0000 | B07WJ6WPN1 | https://www.amazon.in/LOVISTA-Traditional-Prin... | LOVISTA Cotton Gota Patti Tassel Traditional P... | https://images-na.ssl-images-amazon.com/images... | https://images-na.ssl-images-amazon.com/images... | https://images-na.ssl-images-amazon.com/images... | 1968255000.0 | LOVISTA | ... |  |  |  |  |  |  |  |  |  |  |
| 3 | 25798d6dc43239c118452d1bee0fb088 | 2020-02-07 08:32:45 +0000 | B07PYSF4WZ | https://www.amazon.in/People-Printed-Regular-T... | People Men's Printed Regular fit T-Shirt | https://images-na.ssl-images-amazon.com/images... | https://images-na.ssl-images-amazon.com/images... | https://images-na.ssl-images-amazon.com/images... | 1968123000.0 |  | ... |  |  |  |  |  |  |  |  |  |  |
| 4 | ad8a5a196d515ef09dfdaf082bdc37c4 | 2020-02-06 14:27:48 +0000 | B082KXNM7X | https://www.amazon.in/Monte-Carlo-Cotton-Colla... | Monte Carlo Grey Solid Cotton Blend Polo Colla... | https://images-na.ssl-images-amazon.com/images... | https://images-na.ssl-images-amazon.com/images... | https://images-na.ssl-images-amazon.com/images... | 1968070000.0 |  | ... |  |  |  |  |  |  |  |  |  |  |


In [None]:
data.shape

(30000, 33)

In [None]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 30000 entries, 0 to 29999
Data columns (total 33 columns):
 #   Column                         Non-Null Count  Dtype  
---  ------                         --------------  -----  
 0   uniq_id                        30000 non-null  object 
 1   crawl_timestamp                30000 non-null  object 
 2   asin                           30000 non-null  object 
 3   product_url                    30000 non-null  object 
 4   product_name                   30000 non-null  object 
 5   image_urls__small              29998 non-null  object 
 6   medium                         29998 non-null  object 
 7   large                          28841 non-null  object 
 8   browsenode                     29480 non-null  float64
 9   brand                          21857 non-null  object 
 10  sales_price                    27110 non-null  float64
 11  weight                         30000 non-null  object 
 12  rating                         30000 non-null 
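
The non-null counts in the `data.info()` output show that some columns are incomplete; `brand`, for instance, has 21,857 non-null values out of 30,000 rows. One way to summarise this is the share of missing values per column. The sketch below uses a small hypothetical frame named `toy` as a stand-in for `data`, so its values are illustrative only and say nothing about the real dataset.

```python
import pandas as pd

# Hypothetical toy frame standing in for `data`; values are illustrative only.
toy = pd.DataFrame({
    "asin": ["A1", "A2", "A3", "A4"],
    "brand": ["X", None, None, "Y"],
    "sales_price": [10.0, None, 12.5, 8.0],
})

# isna() marks missing cells; the column-wise mean is the missing share.
missing_share = toy.isna().mean().sort_values(ascending=False)

print(missing_share)
```

Applied to the real frame, the same two calls (`data.isna().mean()` followed by `sort_values`) would rank all 33 columns by how much of each is missing.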

In [None]:
data['rating'][0]

5.0