# Advanced Certification in AIML
## A Program by IIIT-H and TalentSprint


## Objectives


Sales forecasting is the process of estimating future sales. Accurate sales forecasts enable companies to make sound business decisions and predict short-term or long-term performance. Forecasts could be based on data such as past sales, industry-wide comparisons and economic trends.

A leading retailer in the USA wants to forecast sales for the product categories in its store, based on the sales history of each category. Sales (revenue) forecasting is very important for retail operations: it helps the retailer plan budgets and investments for a period (monthly or yearly) across product categories such as women's clothing, men's clothing and other clothing. It also lets the retailer reduce revenue lost to unavailable products by investing accordingly.

**Note: This data is proprietary. Please DO NOT share the dataset with anyone. The solution Python notebook and test solution will not be provided.**

In [None]:
#@title Mini Hackathon Walkthrough Video
from IPython.display import HTML

HTML("""<video width="500" height="300" controls>
  <source src="https://cdn.talentsprint.com/talentsprint1/archives/sc/aiml/mini_hackathon_walkthrough.mp4" type="video/mp4">
</video>
""")

## Kaggle link and deadline:


### 1. Link to the Kaggle problem: https://www.kaggle.com/t/5fa027f6da7745219fd0a748107d8677

### 2. Deadlines:
  - **Competition closes at** 6:00 pm, 1st Aug 2020 IST
  - **Submit this Colab file with code to aimlkaggle@gmail.com by** 7:00 pm, 1st Aug 2020 IST

## Instructions:

- Refer to the document **M3_MiniHackathon2 - Kaggle Team Creation** for creating a Kaggle account and accessing the Kaggle problem. Follow its steps for team creation in Kaggle.
- Under the 'Data' tab on the Kaggle competition page (link above), you can find four datasets. Their attributes are described in the **AttributesDescription** file.
- Follow **Stage 1** for downloading the data 
- Combine the datasets and apply data-preprocessing to obtain a clean training dataset
- Build your own model using any algorithms learnt till now
- **Get the Sales predictions for 2015 month-wise and product-wise** (36 rows)
- Copy and paste the predictions in column B (Sales(In ThousandDollars)) of the **Sample_Submission csv file** (ignore the headers)
- Upload the Sample_Submission csv file into Kaggle by clicking on Submit Predictions in Kaggle.
- The leaderboard takes and reflects your best submission until the specified deadline (maximum of 20 submissions per day per team). 
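
The copy-and-paste step above can also be done in code. The following is a minimal sketch, assuming that the second column of the sample file is the `Sales(In ThousandDollars)` column and that the predictions are already in the row order the file expects. The helper name `fill_submission` and the output path are hypothetical.

```python
import pandas as pd


def fill_submission(sample_path, predictions, out_path):
    """Write predictions into column B of a copy of the sample submission."""
    submission = pd.read_csv(sample_path)
    if len(predictions) != len(submission):
        raise ValueError(
            f"expected {len(submission)} predictions, got {len(predictions)}"
        )
    # Column B is addressed by position, so the header text stays untouched.
    submission[submission.columns[1]] = list(predictions)
    submission.to_csv(out_path, index=False)
    return submission
```

A call such as `fill_submission("data/Sample_Submission.csv", predictions, "submission.csv")` would then produce a file ready to upload, provided `predictions` holds the 36 values.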

### **Important: Only the Public Leaderboard rankings are valid, not the Private Leaderboard rankings.**

## Evaluation: 
Evaluation is based on each team's position on the Kaggle leaderboard.

**Total Marks = 20**
- Top 5 teams will be awarded 20 marks
- Teams ranked 6-10 will be awarded 18 marks
- Teams ranked 11-15 will be awarded 16 marks
- Teams ranked 16-20 will be awarded 14 marks
- The rest of the teams will be awarded 12 marks
- **0 marks in case of 0 submissions**
 

 ## Finally..
    Don't cheat!
    Apply yourself!
    Have fun!


## **Stage1:** Setting up colab for Kaggle competitions 
This setup lets you access the Kaggle competition's datasets directly from Colab.

### 1. Create an API key in Kaggle.

To do this, go to kaggle.com/ and open your user settings page. Click My Account.

![alt text](https://i.stack.imgur.com/jxGQv.png)



### 2. Next, scroll down to the API access section and click generate to download an API key. 
![alt text](https://i.stack.imgur.com/Hzlhp.png)

### 3. Upload your kaggle.json file using the following snippet in a code cell:



In [None]:
from google.colab import files
files.upload()

Saving kaggle.json to kaggle.json


{'kaggle.json': b'{"username":"<redacted>","key":"<redacted>"}'}

In [None]:
#If successfully uploaded in the above step, the 'ls' command here should display the kaggle.json file.
%ls

kaggle.json  sample_data/


### 4. Install the Kaggle API using the following command:


In [None]:
!pip install -q kaggle

### 5. Move the kaggle.json file into ~/.kaggle, which is where the API client expects your token to be located:



In [None]:
!mkdir -p ~/.kaggle
!cp kaggle.json ~/.kaggle/

In [None]:
#Execute the following command to verify whether the kaggle.json is stored in the appropriate location: ~/.kaggle/kaggle.json
!ls ~/.kaggle

kaggle.json


In [None]:
!chmod 600 /root/.kaggle/kaggle.json #run this command to ensure your Kaggle API token is secure on colab
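
The checks in steps 3 to 5 can also be run from Python in one go. This is a minimal sketch, assuming the token sits at `~/.kaggle/kaggle.json` and holds the `username` and `key` fields seen in the upload output; `check_kaggle_token` is a hypothetical helper, not part of the Kaggle API.

```python
import json
import stat
from pathlib import Path


def check_kaggle_token(path=Path.home() / ".kaggle" / "kaggle.json"):
    """Return a list of problems found with the token file (empty if none)."""
    path = Path(path)
    if not path.is_file():
        return [f"{path} does not exist"]
    problems = []
    # No permission bits should be set for group or others (chmod 600 is fine).
    mode = stat.S_IMODE(path.stat().st_mode)
    if mode & 0o077:
        problems.append(f"permissions are {oct(mode)}; expected 0o600")
    try:
        token = json.loads(path.read_text())
    except json.JSONDecodeError:
        return problems + ["file is not valid JSON"]
    if not isinstance(token, dict):
        return problems + ["file does not contain a JSON object"]
    # The downloaded token is expected to hold these two fields.
    for field in ("username", "key"):
        if field not in token:
            problems.append(f"missing field: {field}")
    return problems


print(check_kaggle_token())
```

An empty list means the file is in place, readable only by its owner, and has both fields.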

### 6. Now download the data

In [None]:
!mkdir -p data  # -p avoids an error if the folder already exists

**NOTE: If you get a '404 - Not Found' error after running the cell below, it is most likely that the user (whose kaggle.json is uploaded above) has not 'accepted' the rules of the competition and has therefore 'not joined' the competition.**

In [None]:
# If you get a 'Forbidden' or 'Not Found' error, you have most likely not joined the competition.

!kaggle competitions download -c retail-case-study-batch14 -p data

Downloading Sample_Submission.csv to data
  0% 0.00/309 [00:00<?, ?B/s]
100% 309/309 [00:00<00:00, 116kB/s]
Downloading MacroEconomicData.xlsx to data
  0% 0.00/17.4k [00:00<?, ?B/s]
100% 17.4k/17.4k [00:00<00:00, 18.0MB/s]
Downloading Train_Kaggle.csv to data
  0% 0.00/5.31k [00:00<?, ?B/s]
100% 5.31k/5.31k [00:00<00:00, 5.52MB/s]
Downloading Events_HolidaysData.xlsx to data
  0% 0.00/12.4k [00:00<?, ?B/s]
100% 12.4k/12.4k [00:00<00:00, 13.3MB/s]
Downloading AttributesDescription.xlsx to data
  0% 0.00/10.2k [00:00<?, ?B/s]
100% 10.2k/10.2k [00:00<00:00, 9.53MB/s]
Downloading WeatherData.xlsx to data
  0% 0.00/281k [00:00<?, ?B/s]
100% 281k/281k [00:00<00:00, 37.6MB/s]
Downloading Test_Kaggle.csv to data
  0% 0.00/793 [00:00<?, ?B/s]
100% 793/793 [00:00<00:00, 774kB/s]
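
Once the download finishes, the files can be read into pandas. The following is a minimal sketch, assuming the `data` directory used above; `load_tables` is a hypothetical helper. Reading `.xlsx` files requires an Excel engine such as `openpyxl` to be installed, and only the first sheet of each workbook is read here, which may not cover everything a workbook contains.

```python
from pathlib import Path

import pandas as pd


def load_tables(data_dir="data"):
    """Read every .csv and .xlsx file in data_dir into a dict of DataFrames."""
    tables = {}
    for path in sorted(Path(data_dir).iterdir()):
        if path.suffix == ".csv":
            tables[path.stem] = pd.read_csv(path)
        elif path.suffix == ".xlsx":
            # Reads the first sheet only; sheet_name=None would return all sheets.
            tables[path.stem] = pd.read_excel(path)
    return tables
```

The dictionary keys are the file names without extensions, so the training data would be available as `load_tables()["Train_Kaggle"]`.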


## **Stage 2:** YOUR CODE to crack the Kaggle problem here. 

1.  Get the sales predictions for 2015, month-wise and product-wise (36 rows in total). Within each month, order the products as in the Test_Kaggle.csv file.

2.  Copy and paste the predictions in Sample_Submission.csv (in Sales(In ThousandDollars)) and upload into Kaggle. 
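
As a starting point, a simple baseline can show what the 36-row output looks like. The sketch below predicts each month and category as the average of that month in earlier years. It runs on made-up toy data, and the column names `Year`, `Month`, `Category` and `Sales` are placeholders, not the real headers of the competition files. It is not the expected solution, only a shape check for the output.

```python
import pandas as pd

# Toy data only: two years of monthly sales for three categories.
# Values and column names are invented for illustration.
categories = ("men's clothing", "other clothing", "women's clothing")
rows = []
for year in (2013, 2014):
    for month in range(1, 13):
        for k, category in enumerate(categories, start=1):
            sales = 100.0 * k + month + (year - 2013)
            rows.append(
                {"Year": year, "Month": month, "Category": category, "Sales": sales}
            )
train = pd.DataFrame(rows)


def seasonal_mean_forecast(history):
    """Predict each (Month, Category) pair as its mean over all past years."""
    # groupby sorts by Month, then Category, so rows come out month by month.
    return history.groupby(["Month", "Category"], as_index=False)["Sales"].mean()


forecast_2015 = seasonal_mean_forecast(train)
print(len(forecast_2015))  # 12 months x 3 categories = 36 rows
```

With the real data, the category order within each month would still need to be aligned with the test file before submitting.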

After uploading the predictions in Kaggle, the RMSE score will be displayed on leaderboard.

Understand the RMSE score [here](https://medium.com/analytics-vidhya/forecast-kpi-rmse-mae-mape-bias-cdc5703d242d) with an example.
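
To check a model locally before submitting, RMSE can be computed on a held-out part of the training data. This is a minimal sketch of the standard formula with made-up numbers; Kaggle computes the leaderboard score on its own side, so a local value will not necessarily match it.

```python
import numpy as np


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    if actual.shape != predicted.shape:
        raise ValueError("actual and predicted must have the same shape")
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))


# Made-up numbers: the errors are 3, 0 and -4, so RMSE = sqrt((9 + 0 + 16) / 3).
print(rmse([100, 200, 300], [97, 200, 304]))
```

Because the errors are squared before averaging, one large miss raises the score more than several small ones.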

**Note: It is best advised to write all the code here. (If for any reason you are using other colab files, you could cut and paste the code from there into this notebook)**

In [None]:
#  <All Code here>

## **Stage 3:** Each time you submit in Kaggle, ensure that your code in Stage 2 reproduces the submitted result. Follow these steps for validation:
### a) Enter your Kaggle RMSE in the form below 
### b) After entering RMSE below, go to File->'Save and pin revision' (To ensure you do so, you are asked to mark 'Yes' to the instruction asking the same)
**Note: The shortcut for 'Save and pin revision' is Ctrl+M+S**<br>
**Note: You can check if the action has succeeded by going to File->Revision History and you'll find "PIN" checkbox checked if successful.** 


- This action ensures there is 'proof of code' for each submission you make.
- If you submit your results in Kaggle and get a leaderboard RMSE score but don't follow the steps above, your **score will NOT be considered**, as we will have no proof of your code. (We match each 'proof of code' to a submission using your RMSE and the time of the save-and-pin.) In other words, if you want your RMSE score to be considered, you have to follow the process.
- However, for trial submissions (RMSE scores you don't need considered because you are still experimenting in your initial attempts), you don't have to follow the process above.
- **One member from your team can collect all team members' Colab shared links and email them to aimlkaggle@gmail.com as per the deadlines.** Be sure to give view access to aimlkaggle@gmail.com.
- **FINALLY: "Do NOT download and reupload this file as all the revision history will be lost"**




# Submit your RMSE value below:

E.g.:   RMSEValue:     234.07

In [None]:
#@title Submission details are:

RMSEValue = '294.66519' #@param {type:"string"}
Execute_Save_and_Pin_revision_now = 'Yes' #@param ['Yes','No']
