# Advanced Certification in AIML
## A Program by IIIT-H and TalentSprint
## Not for grading

## Learning Objective

At the end of the experiment, you will be able to:

* understand how the KNN classifier works for different 'k' values
* standardize the data using `StandardScaler`

## Dataset

### History

Social network advertising, also called social media targeting, refers to forms of online advertising that focus on social networking services. A major benefit of this type of advertising is that advertisers can use the users’ demographic information to target their ads appropriately. Its advantages are that advertisers can reach users who are interested in their products, it allows detailed analysis and reporting, the information gathered is real rather than derived from statistical projections, and it does not access the users' IP addresses.

### Description

The dataset chosen for this experiment is Social Network Ads. It contains 400 records with the following 5 columns:


**User ID** - A unique identifier for each person.

**Gender** - Gender of the person, either male or female.

**Age** - Age of the person.

**EstimatedSalary** - Estimated salary of the person.

**Purchased** - Contains two numbers ‘0’ or ‘1’. ‘0’ means not purchased and ‘1’ means purchased. This variable is our target variable.

## Setup Steps

In [None]:
#@title Please enter your registration id to start: { run: "auto", display-mode: "form" }
Id = "" #@param {type:"string"}


In [None]:
#@title Please enter your password (normally your phone number) to continue: { run: "auto", display-mode: "form" }
password = "" #@param {type:"string"}


In [None]:
#@title Run this cell to complete the setup for this Notebook
from IPython import get_ipython

ipython = get_ipython()

notebook= "Demo_KNN_Advertising_data" #name of the notebook
Answer = "Ungraded"
def setup():
#  ipython.magic("sx pip3 install torch")
    from IPython.display import HTML, display
    ipython.magic("sx wget https://cdn.iiith.talentsprint.com/aiml/Experiment_related_data/social_advertising.csv")
    display(HTML('<script src="https://dashboard.talentsprint.com/submissions/record_ip.html?traineeId={0}&recordId={1}"></script>'.format(getId(),submission_id)))
    print("Setup completed successfully")
    return

def submit_notebook():

    ipython.magic("notebook -e "+ notebook + ".ipynb")

    import requests, json, base64, datetime

    url = "https://dashboard.talentsprint.com/xp/app/save_notebook_attempts"
    if not submission_id:
      data = {"id" : getId(), "notebook" : notebook, "mobile" : getPassword(), "batch" : ""}
      r = requests.post(url, data = data)
      r = json.loads(r.text)

      if r["status"] == "Success":
          return r["record_id"]
      elif "err" in r:
        print(r["err"])
        return None
      else:
        print ("Something is wrong, the notebook will not be submitted for grading")
        return None

    elif getComplexity() and getAdditional() and getConcepts() and getComments():
      f = open(notebook + ".ipynb", "rb")
      file_hash = base64.b64encode(f.read())

      data = {"complexity" : Complexity, "additional" :Additional,
              "concepts" : Concepts, "record_id" : submission_id,
              "id" : Id, "file_hash" : file_hash,
              "feedback_experiments_input" : Comments, "notebook" : notebook, "batch" : ""}

      r = requests.post(url, data = data)
      r = json.loads(r.text)
      if "err" in r:
        print(r["err"])
        return None
      else:
        print("Your submission is successful.")
        print("Ref Id:", submission_id)
        print("Date of submission: ", r["date"])
        print("Time of submission: ", r["time"])
        print("View your submissions: https://learn-iiith.talentsprint.com/notebook_submissions")
        # print("For any queries/discrepancies, please connect with mentors through the chat icon in LMS dashboard.")
      return submission_id
    else: return submission_id


def getAdditional():
  try:
    if not Additional:
      raise NameError
    else:
      return Additional
  except NameError:
    print ("Please answer Additional Question")
    return None
def getComments():
  try:
    if not Comments:
      raise NameError
    else:
      return Comments
  except NameError:
    print ("Please answer Comments Question")
    return None

def getComplexity():
  try:
    if not Complexity:
      raise NameError
    else:
      return Complexity
  except NameError:
    print ("Please answer Complexity Question")
    return None

def getConcepts():
  try:
    if not Concepts:
      raise NameError
    else:
      return Concepts
  except NameError:
    print ("Please answer Concepts Question")
    return None

def getId():
  try:
    return Id if Id else None
  except NameError:
    return None

def getPassword():
  try:
    return password if password else None
  except NameError:
    return None

submission_id = None
### Setup
if getPassword() and getId():
  submission_id = submit_notebook()
  if submission_id:
    setup()

else:
  print ("Please complete Id and Password cells before running setup")


## Import required packages

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

## Load the data

In [None]:
adv = pd.read_csv("social_advertising.csv")
print(adv.shape)
adv.head()

In [None]:
print(adv['Purchased'].unique())

In [None]:
# Encode Gender as a number so it can be used in distance calculations: Female -> 1, Male -> 0
adv['Gender'] = adv['Gender'].replace(['Female','Male'],[1, 0])

In [None]:
# Identify the features and labels
data = adv.drop(['Purchased', 'User ID'], axis=1)   # equivalent to adv[['Gender', 'Age', 'EstimatedSalary']] or adv.iloc[:, 1:4]
labels = adv['Purchased']

In [None]:
# Print the first five rows of the features
data.head()

## Split the data into train and test sets
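
As a rough illustration of what `train_test_split` does, the sketch below splits a small toy array rather than the advertising data. The names `X_toy`, `y_toy` and their values are hypothetical, and the `stratify` argument is an optional extra that the cells in this section do not use; it keeps the class ratio the same in both splits.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy stand-in for the notebook's features and labels (hypothetical values)
X_toy = np.arange(200).reshape(100, 2)      # 100 samples, 2 features
y_toy = np.array([0] * 70 + [1] * 30)       # 70% zeros, 30% ones

# test_size=0.3 holds out 30% of the rows; random_state makes the shuffle repeatable;
# stratify=y_toy asks for the same class ratio in the train and test sets
Xtr, Xte, ytr, yte = train_test_split(
    X_toy, y_toy, test_size=0.3, random_state=42, stratify=y_toy
)

print(Xtr.shape, Xte.shape)       # number of rows in each split
print(ytr.mean(), yte.mean())     # fraction of label 1 in each split
```

Without `stratify`, the fraction of purchasers can differ between the two sets, which matters more when one class is rare.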

In [None]:
from sklearn.model_selection import train_test_split

In [None]:
# Split the data into train and test sets in a 70:30 ratio
X_train, X_test, y_train, y_test = train_test_split(data, labels, test_size=0.3, random_state=42)

In [None]:
# Shapes of the train and test sets
X_train.shape, X_test.shape, y_train.shape, y_test.shape

## Train a KNN Classifier
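
Conceptually, a KNN classifier stores the training points and predicts the label of a new point by majority vote among its k nearest neighbours. The sketch below is a minimal from-scratch version using Euclidean distance. The function `knn_predict` and the toy arrays `X_demo` and `y_demo` are hypothetical names introduced for illustration; this is not how scikit-learn implements `KNeighborsClassifier` internally.

```python
from collections import Counter
import numpy as np

def knn_predict(X_train, y_train, x_query, k=3):
    """Predict the label of x_query by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point
    distances = np.linalg.norm(X_train - x_query, axis=1)
    # Indices of the k smallest distances
    nearest = np.argsort(distances)[:k]
    # Most common label among those neighbours
    votes = Counter(y_train[nearest].tolist())
    return votes.most_common(1)[0][0]

# Hypothetical toy points: two well-separated clusters with labels 0 and 1
X_demo = np.array([[1.0, 1.0], [1.5, 2.0], [2.0, 1.5],
                   [6.0, 6.0], [6.5, 7.0], [7.0, 6.5]])
y_demo = np.array([0, 0, 0, 1, 1, 1])

print(knn_predict(X_demo, y_demo, np.array([2.0, 2.0]), k=3))   # query near the first cluster
print(knn_predict(X_demo, y_demo, np.array([6.0, 6.5]), k=3))   # query near the second cluster
```

Because the vote depends only on distances, a feature with a large numeric range (such as salary) can dominate a feature with a small range (such as age) unless the data is scaled.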

In [None]:
from sklearn.neighbors import KNeighborsClassifier
k = 3
neigh = KNeighborsClassifier(n_neighbors=k)

In [None]:
# Train or fit the model with the train data
neigh.fit(X_train,y_train)
# Predict labels for the test set with the trained model
y_pred = neigh.predict(X_test)

In [None]:
y_test

In [None]:
# Calculate the score
# print(neigh.score(X_test, y_test))
from sklearn.metrics import accuracy_score
print('Accuracy :',accuracy_score(y_pred,y_test))

In [None]:
# Same value with the conventional (y_true, y_pred) argument order; accuracy is symmetric
print('Accuracy :',accuracy_score(y_test,y_pred))

## Scale the data and classify
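
Standardization rescales each feature to zero mean and unit standard deviation, z = (x - mean) / std, computed per column. The sketch below checks a manual z-score against `StandardScaler` on a few hypothetical (Age, EstimatedSalary) rows; the array `X_raw` and its values are made up for illustration.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical rows of (Age, EstimatedSalary); salary is on a much larger scale
X_raw = np.array([[25.0, 20000.0],
                  [35.0, 60000.0],
                  [45.0, 100000.0]])

# Manual z-score: subtract the column mean, divide by the column standard deviation
X_manual = (X_raw - X_raw.mean(axis=0)) / X_raw.std(axis=0)

# StandardScaler computes the same quantity (population standard deviation, ddof=0)
X_sklearn = StandardScaler().fit_transform(X_raw)

print(np.allclose(X_manual, X_sklearn))                # the two results agree
print(X_sklearn.mean(axis=0), X_sklearn.std(axis=0))   # per-column mean and std after scaling
```

After scaling, both columns contribute on a comparable scale to the distances that KNN relies on.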

In [None]:
# Scale the data: fit the scaler on the training set only, then apply it to both sets
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()

X_train_scale = scaler.fit_transform(X_train)
X_test_scale = scaler.transform(X_test)

# Classify using the scaled data
k = 3
neigh_Scale = KNeighborsClassifier(n_neighbors=k)
neigh_Scale.fit(X_train_scale, y_train)
y_pred_Scale = neigh_Scale.predict(X_test_scale)
# neigh_Scale.score(X_test_scale, y_test) returns the same accuracy
print('Accuracy after scaling :', accuracy_score(y_test, y_pred_Scale))

In [None]:
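# Plot Age vs EstimatedSalary before and after scaling, coloured by the Purchased label
import matplotlib.pyplot as plt
plt.scatter(X_train.iloc[:,1], X_train.iloc[:,2], c=y_train, cmap='viridis')
plt.xlabel('Age')
plt.ylabel('EstimatedSalary')
plt.title('Before scaling')
plt.show()
plt.scatter(X_train_scale[:,1], X_train_scale[:,2], c=y_train, cmap='viridis')
# Dashed lines mark zero, which is the mean of each feature after standardization
plt.axvline(c = 'black',ls ='--')
plt.axhline(c = 'black',ls ='--')
plt.xlabel('Age (scaled)')
plt.ylabel('EstimatedSalary (scaled)')
plt.title('After scaling')
plt.show()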

## Exercise: Observe how the score changes as you vary the value of k

In [None]:
# YOUR CODE HERE
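
One possible way to approach the exercise is to loop over several values of k and record the test accuracy for each. The sketch below runs on synthetic data from `make_classification` so that it is self-contained; the names `X_syn`, `y_syn`, `Xs_train`, `Xs_test`, `scaler_syn`, `k_value` and `scores` are assumptions introduced here. In the notebook you would reuse `X_train_scale`, `X_test_scale`, `y_train` and `y_test` instead.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (an assumption, not the advertising dataset)
X_syn, y_syn = make_classification(n_samples=400, n_features=3, n_informative=2,
                                   n_redundant=0, random_state=42)
Xs_train, Xs_test, ys_train, ys_test = train_test_split(
    X_syn, y_syn, test_size=0.3, random_state=42
)

# Fit the scaler on the training set only, then transform both sets
scaler_syn = StandardScaler()
Xs_train = scaler_syn.fit_transform(Xs_train)
Xs_test = scaler_syn.transform(Xs_test)

# Try odd values of k; odd k avoids tied votes in a two-class problem
scores = {}
for k_value in range(1, 16, 2):
    model = KNeighborsClassifier(n_neighbors=k_value)
    model.fit(Xs_train, ys_train)
    scores[k_value] = model.score(Xs_test, ys_test)

for k_value, acc in scores.items():
    print(f"k={k_value:2d}  accuracy={acc:.3f}")
```

Very small k tends to follow noise in the training data, while very large k smooths over local structure, so the best value usually lies somewhere in between and depends on the dataset.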

### Please answer the questions below to complete the experiment:




In [None]:
#@title How was the experiment? { run: "auto", form-width: "500px", display-mode: "form" }
Complexity = "" #@param ["","Too Simple, I am wasting time", "Good, But Not Challenging for me", "Good and Challenging for me", "Was Tough, but I did it", "Too Difficult for me"]


In [None]:
#@title If it was too easy, what more would you have liked to be added? If it was very difficult, what would you have liked to have been removed? { run: "auto", display-mode: "form" }
Additional = "" #@param {type:"string"}


In [None]:
#@title Can you identify the concepts from the lecture which this experiment covered? { run: "auto", vertical-output: true, display-mode: "form" }
Concepts = "" #@param ["","Yes", "No"]


In [None]:
#@title  Experiment walkthrough video? { run: "auto", vertical-output: true, display-mode: "form" }
Walkthrough = "" #@param ["","Very Useful", "Somewhat Useful", "Not Useful", "Didn't use"]


In [None]:
#@title  Text and image description/explanation and code comments within the experiment: { run: "auto", vertical-output: true, display-mode: "form" }
Comments = "" #@param ["","Very Useful", "Somewhat Useful", "Not Useful", "Didn't use"]


In [None]:
#@title Mentor Support: { run: "auto", vertical-output: true, display-mode: "form" }
Mentor_support = "" #@param ["","Very Useful", "Somewhat Useful", "Not Useful", "Didn't use"]


In [None]:
#@title Run this cell to submit your notebook for Ungrading { vertical-output: true }
try:
  if submission_id:
      return_id = submit_notebook()
      if return_id : submission_id = return_id
  else:
      print("Please complete the setup first.")
except NameError:
  print ("Please complete the setup first.")