# Advanced Certification in AIML
## A Program by IIIT-H and TalentSprint

## Learning Objective

At the end of the experiment, you will be able to:

* Apply Linear Regression on different datasets

## Dataset

### Description

For this experiment we have chosen three datasets:

1. Human height and weight dataset

2. Human brain weight and head size dataset

3. California Housing dataset

**Human height and weight dataset**

This dataset contains 10,000 records and 2 columns / features: the first column is the height and the second column is the weight of the person.

**Human brain weight and head size dataset**

The dataset contains 237 records and 2 columns / features: the first column is the head size (cm^3) and the second column is the brain weight (grams) of the person.

**California housing dataset**

Number of Attributes: 8 numeric, predictive attributes and the target

This dataset was obtained from the StatLib repository.
https://www.dcc.fc.up.pt/~ltorgo/Regression/cal_housing.html

The target variable is the median house value for California districts,
expressed in hundreds of thousands of dollars ($100,000).

This dataset was derived from the 1990 U.S. census, using one row per census
block group. A block group is the smallest geographical unit for which the U.S.
Census Bureau publishes sample data (a block group typically has a population
of 600 to 3,000 people).

A household is a group of people residing within a home. Since the average
number of rooms and bedrooms in this dataset are provided per household, these
columns may take surprisingly large values for block groups with few households
and many empty houses, such as vacation resorts.


        - MedInc        median income in block group
        - HouseAge      median house age in block group
        - AveRooms      average number of rooms per household
        - AveBedrms     average number of bedrooms per household
        - Population    block group population
        - AveOccup      average number of household members
        - Latitude      block group latitude
        - Longitude     block group longitude


**File names:**

* new_height_weight.csv
* HumanBrain_WeightandHead_size.csv

### Setup Steps:

In [21]:
#@title Please enter your registration id to start: { run: "auto", display-mode: "form" }
Id = "2302815" #@param {type:"string"}

In [22]:
#@title Please enter your password (normally your phone number) to continue: { run: "auto", display-mode: "form" }
password = "+6592721549" #@param {type:"string"}

In [23]:
#@title Run this cell to complete the setup for this Notebook
from IPython import get_ipython

ipython = get_ipython()

notebook= "U1W3_09_Linear_Regression_Multi_Datasets_A" #name of the notebook

def setup():
#  ipython.magic("sx pip3 install torch")
    from IPython.display import HTML, display
    ipython.magic("sx wget https://cdn.talentsprint.com/aiml/Experiment_related_data/HumanBrain_WeightandHead_size.csv")
    ipython.magic("sx wget https://cdn.talentsprint.com/aiml/Experiment_related_data/new_height_weight.csv")
    display(HTML('<script src="https://staging.dashboard.talentsprint.com/aiml/record_ip.html?traineeId={0}&recordId={1}"></script>'.format(getId(),submission_id)))
    print("Setup completed successfully")
    return

def submit_notebook():
    ipython.magic("notebook -e "+ notebook + ".ipynb")

    import requests, json, base64, datetime

    url = "https://dashboard.talentsprint.com/xp/app/save_notebook_attempts"
    if not submission_id:
      data = {"id" : getId(), "notebook" : notebook, "mobile" : getPassword()}
      r = requests.post(url, data = data)
      r = json.loads(r.text)

      if r["status"] == "Success":
          return r["record_id"]
      elif "err" in r:
        print(r["err"])
        return None
      else:
        print ("Something is wrong, the notebook will not be submitted for grading")
        return None

    elif getAnswer() and getComplexity() and getAdditional() and getConcepts() and getWalkthrough() and getComments() and getMentorSupport():
      # Read the exported notebook and close the file handle when done
      with open(notebook + ".ipynb", "rb") as f:
        file_hash = base64.b64encode(f.read())

      data = {"complexity" : Complexity, "additional" :Additional,
              "concepts" : Concepts, "record_id" : submission_id,
              "answer" : Answer, "id" : Id, "file_hash" : file_hash,
              "notebook" : notebook, "feedback_walkthrough":Walkthrough ,
              "feedback_experiments_input" : Comments,
              "feedback_inclass_mentor": Mentor_support}

      r = requests.post(url, data = data)
      r = json.loads(r.text)
      if "err" in r:
        print(r["err"])
        return None
      else:
        print("Your submission is successful.")
        print("Ref Id:", submission_id)
        print("Date of submission: ", r["date"])
        print("Time of submission: ", r["time"])
        print("View your submissions: https://aiml-iiith.talentsprint.com/notebook_submissions")
        #print("For any queries/discrepancies, please connect with mentors through the chat icon in LMS dashboard.")
        return submission_id
    else: return submission_id


def getAdditional():
  try:
    if not Additional:
      raise NameError
    else:
      return Additional
  except NameError:
    print ("Please answer Additional Question")
    return None

def getComplexity():
  try:
    if not Complexity:
      raise NameError
    else:
      return Complexity
  except NameError:
    print ("Please answer Complexity Question")
    return None

def getConcepts():
  try:
    if not Concepts:
      raise NameError
    else:
      return Concepts
  except NameError:
    print ("Please answer Concepts Question")
    return None


def getWalkthrough():
  try:
    if not Walkthrough:
      raise NameError
    else:
      return Walkthrough
  except NameError:
    print ("Please answer Walkthrough Question")
    return None

def getComments():
  try:
    if not Comments:
      raise NameError
    else:
      return Comments
  except NameError:
    print ("Please answer Comments Question")
    return None


def getMentorSupport():
  try:
    if not Mentor_support:
      raise NameError
    else:
      return Mentor_support
  except NameError:
    print ("Please answer Mentor support Question")
    return None

def getAnswer():
  try:
    if not Answer:
      raise NameError
    else:
      return Answer
  except NameError:
    print ("Please answer Question")
    return None


def getId():
  try:
    return Id if Id else None
  except NameError:
    return None

def getPassword():
  try:
    return password if password else None
  except NameError:
    return None

submission_id = None
### Setup
if getPassword() and getId():
  submission_id = submit_notebook()
  if submission_id:
    setup()
else:
  print ("Please complete Id and Password cells before running setup")



Setup completed successfully


## Import Required Packages

In [24]:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import warnings
warnings.filterwarnings('ignore')

## Load the Datasets

1. Load the height weight dataset

In [25]:
height_weight_data = pd.read_csv("/content/new_height_weight.csv")
height_weight_data.head()

| Unnamed: 0 | Height | Weight |
|---|---|---|
| 0 | 73.847017 | 241.893563 |
| 1 | 68.781904 | 162.310473 |
| 2 | 74.110105 | 212.740856 |
| 3 | 71.730978 | 220.04247 |
| 4 | 69.881796 | 206.349801 |


2. Load the Human Brain Weight Head Size dataset

In [26]:
HumanBrain_WeightandHead_Size = pd.read_csv("/content/HumanBrain_WeightandHead_size.csv")
HumanBrain_WeightandHead_Size.head()

| Unnamed: 0 | Head Size(cm^3) | Brain Weight(grams) |
|---|---|---|
| 0 | 4512 | 1530 |
| 1 | 3738 | 1297 |
| 2 | 4261 | 1335 |
| 3 | 3777 | 1282 |
| 4 | 4177 | 1590 |


3. Load the California housing data

In [27]:
# Load the dataset from sklearn datasets package
from sklearn.datasets import fetch_california_housing
housing = fetch_california_housing()

HTTPError: HTTP Error 403: Forbidden

In [28]:
# data: Feature values for each census block group
# target: Median house value per block group, in units of $100,000
# feature_names: Names of the features
# DESCR: Describes the dataset

print(housing.keys())

NameError: name 'housing' is not defined

In [29]:
# Consider only first data feature
california_data = housing.data[:, :1]
california_target = housing.target
california_data.shape

NameError: name 'housing' is not defined
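
The slice `housing.data[:, :1]` in the cell above is written so that the single feature stays a two-dimensional column, which is the shape scikit-learn estimators expect for `X`. Since the fetch failed in this run, the following is only a minimal sketch on a made-up NumPy array (`data` is a hypothetical stand-in, not the housing data) showing how a slice differs from an integer index:

```python
import numpy as np

# Hypothetical stand-in for housing.data: 4 samples, 3 features
data = np.arange(12, dtype=float).reshape(4, 3)

# A slice keeps the column axis, so the result is still 2-D
first_feature_2d = data[:, :1]

# An integer index drops the column axis, giving a 1-D array
first_feature_1d = data[:, 0]

print(first_feature_2d.shape)
print(first_feature_1d.shape)
```

The 2-D form can be passed directly as `X` to `fit`; the 1-D form would first need `reshape(-1, 1)`.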

## Apply Linear Regression on all three datasets (Ungraded Exercise)

#### Create an object for Linear regression model

In [30]:
# YOUR CODE HERE
from sklearn.linear_model import LinearRegression

model = LinearRegression()
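
To illustrate what the estimator object holds after fitting, here is a minimal sketch on synthetic data. The arrays and the name `demo_model` are assumptions made for illustration and are not part of the experiment's datasets. The points lie exactly on the line y = 2x + 1, so the fitted slope and intercept can be checked by eye:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data (assumption for illustration): y = 2x + 1 with no noise
X_demo = np.array([[0.0], [1.0], [2.0], [3.0]])
y_demo = 2.0 * X_demo.ravel() + 1.0

demo_model = LinearRegression()
demo_model.fit(X_demo, y_demo)

# coef_ holds one slope per feature; intercept_ is the constant term
slope = demo_model.coef_[0]
intercept = demo_model.intercept_
print(slope, intercept)
```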

#### Perform Linear regression on Height weight data

1. Get the features and labels from the data
2. Split the data into train and test sets
3. Fit the model
4. Predict on the test data with the fitted model


In [31]:
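# Feature: Height (double brackets keep X two-dimensional); label: Weight
X = height_weight_data[["Height"]]
y = height_weight_data["Weight"]
from sklearn.model_selection import train_test_split

# Hold out 20% of the rows for testing; random_state makes the split repeatable
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)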
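
The four steps listed for this exercise, followed by a quick evaluation, might look roughly like the sketch below. It runs on synthetic height-like and weight-like values generated from an arbitrary linear rule plus noise, so the numbers say nothing about the real CSV; all variable names here are hypothetical:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): one feature, noisy linear target
rng = np.random.default_rng(0)
heights = rng.uniform(60, 75, size=200).reshape(-1, 1)
weights = 6.0 * heights.ravel() - 250.0 + rng.normal(0, 5, size=200)

# Split, fit on the training part, predict on the held-out part
X_tr, X_te, y_tr, y_te = train_test_split(
    heights, weights, test_size=0.2, random_state=42
)
reg = LinearRegression().fit(X_tr, y_tr)
y_hat = reg.predict(X_te)

# Mean squared error and R^2 compare predictions with the held-out labels
mse = mean_squared_error(y_te, y_hat)
r2 = r2_score(y_te, y_hat)
print(mse, r2)
```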


#### Perform Linear regression on Human Brain Weight and Head Size data

1. Get the features and labels from the data
2. Split the data into train and test sets
3. Fit the model
4. Predict on the test data with the fitted model


In [32]:
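# YOUR CODE HERE
# Note: X_train and y_train still come from the height-weight split above;
# the brain dataset needs its own features, labels, and split before fitting.
model.fit(X_train, y_train)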
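
For the brain dataset itself, the four listed steps could be sketched as follows. To stay self-contained, the example rebuilds only the five rows shown in the `head()` output above rather than reading the full 237-record file, and `fit_and_predict` is a hypothetical helper introduced here, not part of the original notebook:

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Five rows copied from the head() output; a tiny stand-in for the full CSV
brain_df = pd.DataFrame({
    "Head Size(cm^3)": [4512, 3738, 4261, 3777, 4177],
    "Brain Weight(grams)": [1530, 1297, 1335, 1282, 1590],
})

def fit_and_predict(df, feature_col, target_col, test_size=0.2, seed=42):
    """Hypothetical helper: split, fit a fresh model, predict on the test rows."""
    X = df[[feature_col]]   # double brackets keep X two-dimensional
    y = df[target_col]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=seed
    )
    reg = LinearRegression().fit(X_train, y_train)
    return reg, X_test, y_test, reg.predict(X_test)

# test_size=0.4 leaves 3 training rows and 2 test rows in this tiny sample
brain_model, brain_X_test, brain_y_test, brain_pred = fit_and_predict(
    brain_df, "Head Size(cm^3)", "Brain Weight(grams)", test_size=0.4
)
print(brain_pred.shape)
```

With the real file, the same helper could be called on the DataFrame returned by `pd.read_csv`, using the default 20% test split.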



#### Perform Linear regression on California data

1. Split the data into train and test sets
2. Fit the model
3. Predict on the test data with the fitted model


In [33]:
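# YOUR CODE HERE
# Note: this predicts on X_test from the height-weight split, not on California data;
# the California arrays were never created because the fetch above failed.
y_pred = model.predict(X_test)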
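
Because the California data could not be downloaded in this run, the three listed steps can only be sketched on stand-in arrays. The example below assumes a single-column feature matrix shaped like `housing.data[:, :1]` and a synthetic target; the `_demo` names and the generating rule are hypothetical. It uses a separate estimator, since calling `fit` again on an already fitted `LinearRegression` replaces its earlier coefficients:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-ins (assumption): one feature column, noisy linear target
rng = np.random.default_rng(1)
california_data_demo = rng.uniform(0.5, 15.0, size=(500, 1))
california_target_demo = (
    0.4 * california_data_demo[:, 0] + 0.5 + rng.normal(0, 0.3, size=500)
)

# 1. Split the data into train and test sets
X_train_c, X_test_c, y_train_c, y_test_c = train_test_split(
    california_data_demo, california_target_demo, test_size=0.2, random_state=42
)

# 2. Fit a fresh model so earlier fits are left untouched
california_model = LinearRegression()
california_model.fit(X_train_c, y_train_c)

# 3. Predict on the test data; score() returns R^2 on the given data
y_pred_c = california_model.predict(X_test_c)
test_r2 = california_model.score(X_test_c, y_test_c)
print(y_pred_c.shape, test_r2)
```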

### Please answer the questions below to complete the experiment:




In [34]:
#@title Target value in linear regression will have continuous and categorical variables ? { run: "auto", form-width: "500px", display-mode: "form" }
Answer = "FALSE" #@param ["","TRUE", "FALSE"]


In [38]:
#@title How was the experiment? { run: "auto", form-width: "500px", display-mode: "form" }
Complexity = "Good, But Not Challenging for me" #@param ["","Too Simple, I am wasting time", "Good, But Not Challenging for me", "Good and Challenging for me", "Was Tough, but I did it", "Too Difficult for me"]


In [35]:
#@title If it was too easy, what more would you have liked to be added? If it was very difficult, what would you have liked to have been removed? { run: "auto", display-mode: "form" }
Additional = "na" #@param {type:"string"}


In [36]:
#@title Can you identify the concepts from the lecture which this experiment covered? { run: "auto", vertical-output: true, display-mode: "form" }
Concepts = "Yes" #@param ["","Yes", "No"]


In [37]:
#@title  Experiment walkthrough video? { run: "auto", vertical-output: true, display-mode: "form" }
Walkthrough = "Very Useful" #@param ["","Very Useful", "Somewhat Useful", "Not Useful", "Didn't use"]


In [39]:
#@title  Text and image description/explanation and code comments within the experiment: { run: "auto", vertical-output: true, display-mode: "form" }
Comments = "Very Useful" #@param ["","Very Useful", "Somewhat Useful", "Not Useful", "Didn't use"]


In [40]:
#@title Mentor Support: { run: "auto", vertical-output: true, display-mode: "form" }
Mentor_support = "Very Useful" #@param ["","Very Useful", "Somewhat Useful", "Not Useful", "Didn't use"]


In [41]:
#@title Run this cell to submit your notebook for grading { vertical-output: true }
try:
  if submission_id:
      return_id = submit_notebook()
      if return_id : submission_id = return_id
  else:
      print("Please complete the setup first.")
except NameError:
  print ("Please complete the setup first.")

Your submission is successful.
Ref Id: 4343
Date of submission:  23 May 2024
Time of submission:  20:50:13
View your submissions: https://aiml-iiith.talentsprint.com/notebook_submissions
