# Advanced Certification in AIML
## A Program by IIIT-H and TalentSprint

## Learning Objective

At the end of the experiment, you will be able to:

* Classify fruit data using a KNN classifier
* Visualize the predictions before and after scaling

## Dataset

The dataset chosen for this experiment is a handmade fruits dataset containing 60 records. Each record holds the following details of a fruit:

* Weight - The mass of the fruit, measured in grams.

* Sphericity - A measure of how closely the shape of an object approaches that of a mathematically perfect sphere.

* Color - Every fruit has a different color at different stages, so the color is encoded as an integer value. For example:

     - Green as 20
     - Greenish Yellow as 40
     - Orange as 60
     - Red as 80
     - Reddish Yellow as 100

* Label - For simplicity, only two fruits are considered: Apple and Orange.
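
The source does not state which definition of sphericity was used to build the dataset. As a minimal sketch, assuming the common Wadell definition (the surface area of a sphere with the same volume as the object, divided by the object's own surface area), the value can be computed as follows. The function name `sphericity` is hypothetical and used only for illustration.

```python
import math

def sphericity(volume, surface_area):
    """Wadell sphericity: area of the equal-volume sphere divided by the object's area."""
    return (math.pi ** (1 / 3)) * ((6 * volume) ** (2 / 3)) / surface_area

# A sphere of radius 1 scores 1.0 (up to floating-point rounding).
r = 1.0
sphere_score = sphericity(4 / 3 * math.pi * r ** 3, 4 * math.pi * r ** 2)

# A unit cube is less sphere-like and scores roughly 0.806.
cube_score = sphericity(1.0, 6.0)

print(round(sphere_score, 3), round(cube_score, 3))
```

Under this definition, values closer to 1 indicate a rounder object.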




## Setup Steps

In [None]:
#@title Please enter your registration id to start: { run: "auto", display-mode: "form" }
Id = "" #@param {type:"string"}


In [None]:
#@title Please enter your password (normally your phone number) to continue: { run: "auto", display-mode: "form" }
password = "" #@param {type:"string"}


In [None]:
#@title Run this cell to complete the setup for this Notebook
from IPython import get_ipython

ipython = get_ipython()

notebook= "Demo_KNN_Scaling" #name of the notebook
Answer = "Ungraded"
def setup():
#  ipython.magic("sx pip3 install torch")
    from IPython.display import HTML, display
    ipython.magic("sx wget https://cdn.talentsprint.com/aiml/Experiment_related_data/fruits_weight_sphercity.csv")
    display(HTML('<script src="https://dashboard.talentsprint.com/submissions/record_ip.html?traineeId={0}&recordId={1}"></script>'.format(getId(),submission_id)))
    print("Setup completed successfully")
    return

def submit_notebook():

    ipython.magic("notebook -e "+ notebook + ".ipynb")

    import requests, json, base64, datetime

    url = "https://dashboard.talentsprint.com/xp/app/save_notebook_attempts"
    if not submission_id:
      data = {"id" : getId(), "notebook" : notebook, "mobile" : getPassword(), "batch" : ""}
      r = requests.post(url, data = data)
      r = json.loads(r.text)

      if r["status"] == "Success":
          return r["record_id"]
      elif "err" in r:
        print(r["err"])
        return None
      else:
        print ("Something is wrong, the notebook will not be submitted for grading")
        return None

    elif getComplexity() and getAdditional() and getConcepts() and getComments():
      f = open(notebook + ".ipynb", "rb")
      file_hash = base64.b64encode(f.read())

      data = {"complexity" : Complexity, "additional" :Additional,
              "concepts" : Concepts, "record_id" : submission_id,
              "id" : Id, "file_hash" : file_hash,
              "feedback_experiments_input" : Comments, "notebook" : notebook, "batch" : ""}

      r = requests.post(url, data = data)
      r = json.loads(r.text)
      if "err" in r:
        print(r["err"])
        return None
      else:
        print("Your submission is successful.")
        print("Ref Id:", submission_id)
        print("Date of submission: ", r["date"])
        print("Time of submission: ", r["time"])
        print("View your submissions: https://learn-iiith.talentsprint.com/notebook_submissions")
        # print("For any queries/discrepancies, please connect with mentors through the chat icon in LMS dashboard.")
      return submission_id
    else: return submission_id


def getAdditional():
  try:
    if not Additional:
      raise NameError
    else:
      return Additional
  except NameError:
    print ("Please answer Additional Question")
    return None
def getComments():
  try:
    if not Comments:
      raise NameError
    else:
      return Comments
  except NameError:
    print ("Please answer Comments Question")
    return None

def getComplexity():
  try:
    if not Complexity:
      raise NameError
    else:
      return Complexity
  except NameError:
    print ("Please answer Complexity Question")
    return None

def getConcepts():
  try:
    if not Concepts:
      raise NameError
    else:
      return Concepts
  except NameError:
    print ("Please answer Concepts Question")
    return None

def getId():
  try:
    return Id if Id else None
  except NameError:
    return None

def getPassword():
  try:
    return password if password else None
  except NameError:
    return None

submission_id = None
### Setup
if getPassword() and getId():
  submission_id = submit_notebook()
  if submission_id:
    setup()

else:
  print ("Please complete Id and Password cells before running setup")


## Import required packages

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

In [None]:
fruits_data = pd.read_csv('fruits_weight_sphercity.csv')
fruits_data.head()

In [None]:
# Encode the labels and Color column
fruits_data['Color'] = fruits_data['Color'].replace(['Green', 'Greenish yellow', 'Orange', 'Red','Reddish yellow'],[20, 40, 60, 80, 100])
fruits_data['labels'] = fruits_data['labels'].replace(['apple','orange'],[1, 0])

In [None]:
fruits_data.shape

## Take every third sample for training

In [None]:
# Take every third row (20 samples) for the Train set
train = fruits_data[0:60:3]
train

## Check the length of the dataset

In [None]:
print(len(fruits_data))
print(len(train))
print(type(train))

In [None]:
# Take every tenth row starting at index 1 (5 samples) for the Test set
test = fruits_data[1:50:10]
test

In [None]:
print(len(test))

In [None]:
# Features of training data and testing data
traindata = train.iloc[:, 1:3]
testdata = test.iloc[:, 1:3]

In [None]:
traindata.head()

In [None]:
testdata.head()

In [None]:
traindata.shape, testdata.shape

In [None]:
train.labels

## Apply KNN Classifier on the data

In [None]:
from sklearn.neighbors import KNeighborsClassifier
k = 3
neigh = KNeighborsClassifier(n_neighbors=k)
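
The idea behind a K-nearest-neighbours classifier can be sketched in a few lines of plain Python: measure the distance from the query point to every training point, keep the `k` closest, and take a majority vote over their labels. The block below is an illustrative sketch, not scikit-learn's implementation. The helper `knn_predict` and the toy values are hypothetical. It uses Euclidean distance with equal-weight votes, which corresponds to the default settings of `KNeighborsClassifier`.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical (weight in grams, sphericity) values, not taken from the dataset.
toy_points = [(150, 0.95), (155, 0.97), (160, 0.96), (200, 0.80), (210, 0.78), (205, 0.82)]
toy_labels = [0, 0, 0, 1, 1, 1]   # 0 = orange, 1 = apple, as in the label encoding

print(knn_predict(toy_points, toy_labels, (158, 0.90)))
print(knn_predict(toy_points, toy_labels, (202, 0.85)))
```

With `k = 3` and two classes, a vote can never tie, which is one reason an odd `k` is a convenient choice for binary problems.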

In [None]:
# Train or fit the model with the train data
neigh.fit(traindata, train.labels)

# Test the trained model
predictions = neigh.predict(testdata)

In [None]:
print(predictions, "predictions")
print(test.labels.values, "Actual_labels")
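
Comparing the two printed arrays by eye works for five samples. For a single summary number, the fraction of matching labels (accuracy) can be computed as sketched below. The helper `accuracy` and the example arrays are hypothetical. In the notebook, `predictions` and `test.labels.values` would take their place, and `sklearn.metrics.accuracy_score` computes the same quantity.

```python
def accuracy(predicted, actual):
    """Fraction of positions where the predicted label equals the actual label."""
    matches = sum(1 for p, a in zip(predicted, actual) if p == a)
    return matches / len(actual)

# Hypothetical label arrays for illustration only.
example_predictions = [1, 0, 1, 1, 0]
example_actual = [1, 0, 0, 1, 0]

print(accuracy(example_predictions, example_actual))
```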

In [None]:
# Stack the test data with predictions (can be used for plotting)
predicted_data = np.column_stack((testdata.iloc[:,:2], predictions))

predicted_df = pd.DataFrame(predicted_data, columns = ['Weight','Sphericity', 'labels'])
predicted_df.head()

## Plot the train, test and predictions before scaling

In [None]:
import matplotlib.pyplot as plt
from mlxtend.plotting import category_scatter

def plotting(traindata, testdata, df_Pred):

  Oranges_train, Oranges_test = traindata[traindata.labels == 0], testdata[testdata.labels == 0]
  Apples_train, Apples_test = traindata[traindata.labels == 1], testdata[testdata.labels == 1]

  Oranges_pred = df_Pred[df_Pred.iloc[:,2] == 0]
  Apples_pred = df_Pred[df_Pred.iloc[:,2] == 1]

  Oranges_train.shape , Apples_train.shape, Oranges_test.shape, Apples_test.shape, Oranges_pred.shape, Apples_pred.shape

  df1 = (pd.concat([Oranges_train, Oranges_test, Apples_train, Apples_test], axis=0, keys=('Oranges_train', 'Oranges_test', 'Apples_train', 'Apples_test'))
          .swaplevel(0,1, axis=0))
  df1 = df1.reset_index(level=1)
  df2 = (pd.concat([Oranges_train, Oranges_pred, Apples_train, Apples_pred], axis=0, keys=('Oranges_train', 'Oranges_pred', 'Apples_train','Apples_pred'))
          .swaplevel(0,1, axis=0))
  df2 = df2.reset_index(level=1)

  fig = category_scatter(x='Sphericity', y='Weight', label_col='level_1',
                        data=df1, markers='*o*o', colors=('red', 'red', 'green', 'green'), markersize=50, legend_loc='upper left')

  fig = category_scatter(x='Sphericity', y='Weight', label_col='level_1',
                        data=df2, markers='*o*o', colors=('red', 'red', 'green', 'green'), markersize=50, legend_loc='upper left')

In [None]:
plotting(train, test, predicted_df)

## Scaling the data

In [None]:
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
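
Scaling matters for KNN because the Euclidean distance is dominated by the feature with the largest numeric range, here the weight in grams. The sketch below uses hypothetical points and assumed feature ranges (none of them taken from the dataset) to show how the nearest neighbour of a query can change once both features are brought to a comparable scale.

```python
import math

# Hypothetical (weight in grams, sphericity) points, not taken from the dataset.
query = (150.0, 0.95)
candidates = {"A": (152.0, 0.60), "B": (170.0, 0.94)}

# Assumed feature ranges used for min-max scaling in this sketch.
weight_range = (100.0, 300.0)
sphericity_range = (0.5, 1.0)

def scale(point):
    """Min-max scale one (weight, sphericity) point with the assumed ranges."""
    w, s = point
    return (
        (w - weight_range[0]) / (weight_range[1] - weight_range[0]),
        (s - sphericity_range[0]) / (sphericity_range[1] - sphericity_range[0]),
    )

def nearest(q, points):
    """Name of the candidate closest to q by Euclidean distance."""
    return min(points, key=lambda name: math.dist(q, points[name]))

nearest_raw = nearest(query, candidates)
nearest_scaled = nearest(scale(query), {n: scale(p) for n, p in candidates.items()})
print(nearest_raw, nearest_scaled)
```

Before scaling, point A wins because its weight is only 2 grams away, even though its sphericity is very different. After scaling, point B becomes the nearest neighbour.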

In [None]:
# Data Before Scaling
fruits_data.head()

In [None]:
fruits_data[['Sphericity', 'Weight']] = scaler.fit_transform(fruits_data[['Sphericity', 'Weight']])

In [None]:
# Data After Scaling
fruits_data.head()
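
With its default settings, `MinMaxScaler` maps each feature to the range [0, 1] using `(x - min) / (max - min)`, computed per column. The following sketch applies the same formula by hand to a hypothetical list of weights. The helper name `min_max_scale` is an assumption for illustration.

```python
def min_max_scale(values):
    """Rescale a list of numbers to the range [0, 1] using (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical weights in grams, not taken from the dataset.
weights = [120, 150, 180, 240]
print(min_max_scale(weights))
```

The smallest value maps to 0, the largest to 1, and the rest keep their relative spacing in between.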

### Take every third sample for training

In [None]:
Train = fruits_data[0:60:3]
Train.head()

In [None]:
Test = fruits_data[1:50:10]
Test

In [None]:
print(len(Test))

### Apply KNN Classifier on the scaled data

In [None]:
from sklearn.neighbors import KNeighborsClassifier
k = 3
Neigh = KNeighborsClassifier(n_neighbors=k)

In [None]:
# Select the feature columns of the scaled train and test sets
Traindata = Train.iloc[:,1:3]
Testdata = Test.iloc[:,1:3]

In [None]:
# Train or fit the new model with the scaled train data
Neigh.fit(Traindata, Train.labels)

# Test the trained model on the scaled test data
scaled_predictions = Neigh.predict(Testdata)

In [None]:
print(scaled_predictions,"predictions")
print(Test.labels.values,"labels")

In [None]:
predicted_data = np.column_stack((Testdata.iloc[:,:2], scaled_predictions))

df_Pred_scale = pd.DataFrame(predicted_data, columns = ['Weight','Sphericity', 'labels'])
df_Pred_scale.head()

### Plot the train and test points after scaling

In [None]:
plotting(Train, Test, df_Pred_scale)

### Please answer the questions below to complete the experiment:




In [None]:
#@title How was the experiment? { run: "auto", form-width: "500px", display-mode: "form" }
Complexity = "" #@param ["","Too Simple, I am wasting time", "Good, But Not Challenging for me", "Good and Challenging for me", "Was Tough, but I did it", "Too Difficult for me"]


In [None]:
#@title If it was too easy, what more would you have liked to be added? If it was very difficult, what would you have liked to have been removed? { run: "auto", display-mode: "form" }
Additional = "" #@param {type:"string"}


In [None]:
#@title Can you identify the concepts from the lecture which this experiment covered? { run: "auto", vertical-output: true, display-mode: "form" }
Concepts = "" #@param ["","Yes", "No"]


In [None]:
#@title  Experiment walkthrough video? { run: "auto", vertical-output: true, display-mode: "form" }
Walkthrough = "" #@param ["","Very Useful", "Somewhat Useful", "Not Useful", "Didn't use"]


In [None]:
#@title  Text and image description/explanation and code comments within the experiment: { run: "auto", vertical-output: true, display-mode: "form" }
Comments = "" #@param ["","Very Useful", "Somewhat Useful", "Not Useful", "Didn't use"]


In [None]:
#@title Mentor Support: { run: "auto", vertical-output: true, display-mode: "form" }
Mentor_support = "" #@param ["","Very Useful", "Somewhat Useful", "Not Useful", "Didn't use"]


In [None]:
#@title Run this cell to submit your notebook for Ungrading { vertical-output: true }
try:
  if submission_id:
      return_id = submit_notebook()
      if return_id : submission_id = return_id
  else:
      print("Please complete the setup first.")
except NameError:
  print ("Please complete the setup first.")