

# Test Environment for Generative AI classroom labs

This lab provides a test environment for code generated in the Generative AI classroom.

Follow the instructions below to set up this environment for further use.


# Setup


### Install required libraries

If your task requires additional Python libraries, install them as shown below.


In [1]:
%pip install seaborn
import piplite

await piplite.install(['nbformat', 'plotly'])

### Dataset URL from the GenAI lab
Paste the URL provided in the GenAI lab into the cell below.


In [2]:
URL = "https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DA0101EN-Coursera/laptop_pricing_dataset_mod1.csv"

### Downloading the dataset

Execute the following code to download the dataset into the notebook environment.

> Please note that this step is essential in JupyterLite. If you are running a downloaded copy of this notebook in JupyterLab, you can skip this step and pass the URL directly to `pandas.read_csv()` to read the dataset into a DataFrame.
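
Outside JupyterLite, the download step is unnecessary because `pandas.read_csv()` accepts a URL as well as a local path or file-like object. The following is a minimal sketch, assuming a regular JupyterLab kernel; the helper name `load_dataset` is hypothetical and not part of the lab, and the sample rows are made up so that the sketch runs without network access.

```python
import io

import pandas as pd


def load_dataset(source):
    """Read a CSV from a URL, a local path, or a file-like object."""
    # pandas.read_csv handles all three kinds of source
    return pd.read_csv(source)


# Made-up in-memory sample (not rows from the lab dataset)
sample = io.StringIO("Manufacturer,Price\nA,100\nB,200\n")
demo_df = load_dataset(sample)
print(demo_df.shape)
```

In JupyterLab with network access, `df = load_dataset(URL)` would take the place of the download cell.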


In [7]:
from pyodide.http import pyfetch

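async def download(url, filename):
    # Fetch the file and save it to the JupyterLite file system
    response = await pyfetch(url)
    if response.status != 200:
        # Fail loudly instead of silently skipping the write
        raise RuntimeError(f"Download failed with HTTP status {response.status}")
    with open(filename, "wb") as f:
        f.write(await response.bytes())

path = URL
file_name = "dataset.csv"

await download(path, file_name)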

---


# Test Environment


In [8]:

import pandas as pd

# Path to the CSV file
file_path = "dataset.csv"

# Read the CSV into a DataFrame (first row treated as headers by default)
df = pd.read_csv(file_path)

# Identify columns that contain any missing values
cols_with_missing = df.columns[df.isna().any()].tolist()

# Output the list of columns with missing values
print("Columns with missing values:", cols_with_missing)

# Optional: display missing value counts per column
missing_counts = df.isna().sum()
print("Missing values per column:")
print(missing_counts)

Columns with missing values: ['Screen_Size_cm', 'Weight_kg']
Missing values per column:
Unnamed: 0        0
Manufacturer      0
Category          0
Screen            0
GPU               0
OS                0
CPU_core          0
Screen_Size_cm    4
CPU_frequency     0
RAM_GB            0
Storage_GB_SSD    0
Weight_kg         5
Price             0
dtype: int64


In [10]:


import pandas as pd

# df is the DataFrame loaded in the previous cell
# 1) Screen_Size_cm is treated here as a categorical column:
#    replace its missing values with the most frequent value
screen_size_mode = df['Screen_Size_cm'].mode().iloc[0]

# 2) Replace missing values in the continuous column with its mean value
weight_kg_mean = df['Weight_kg'].mean()

# Apply the replacements
df = df.fillna({
    'Screen_Size_cm': screen_size_mode,
    'Weight_kg': weight_kg_mean
})
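
To check that this kind of replacement behaves as intended, it can help to try it on a small frame first. The sketch below uses made-up values (not rows from the lab dataset) and the same column names, and confirms that no missing values remain after `fillna`.

```python
import numpy as np
import pandas as pd

# Toy data, made up for illustration only
toy = pd.DataFrame({
    "Screen_Size_cm": [35.56, 39.62, 39.62, np.nan],
    "Weight_kg": [1.0, 2.0, np.nan, 3.0],
})

fill_values = {
    # Most frequent value; mode() returns a Series, so take its first entry
    "Screen_Size_cm": toy["Screen_Size_cm"].mode().iloc[0],
    # Mean of the non-missing values (NaN is skipped by default)
    "Weight_kg": toy["Weight_kg"].mean(),
}
toy_filled = toy.fillna(fill_values)

print(fill_values)
print(toy_filled.isna().sum().sum())  # 0 when every gap was filled
```

Passing a dictionary to `fillna` applies a different fill value to each named column and leaves all other columns untouched.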

In [11]:
df['Screen_Size_cm'] = df['Screen_Size_cm'].astype(float)
df['Weight_kg'] = df['Weight_kg'].astype(float)

In [12]:
# Convert Screen_Size_cm from centimeters to inches (1 inch = 2.54 cm),
# store the result in Screen_Size_inch, and drop the original column
df['Screen_Size_inch'] = df['Screen_Size_cm'] / 2.54
df = df.drop(columns=['Screen_Size_cm'])
# Convert Weight_kg from kilograms to pounds (1 kg is about 2.2046 lb),
# store the result in Weight_pounds, and drop the original column
df['Weight_pounds'] = df['Weight_kg'] * 2.2046226218
df = df.drop(columns=['Weight_kg'])

In [13]:

# Normalize CPU_frequency by dividing each value by the column maximum
df['CPU_frequency'] /= df['CPU_frequency'].max()
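
Dividing a column by its maximum rescales the largest value to 1 and, for positive data, keeps every value between 0 and 1. A minimal sketch on made-up frequencies (not values from the lab dataset):

```python
import pandas as pd

# Made-up CPU frequencies for illustration
freq = pd.Series([1.6, 2.0, 2.4, 3.2], name="CPU_frequency")

# Max scaling: each value is expressed as a fraction of the largest one
scaled = freq / freq.max()

print(scaled.tolist())
print(scaled.max())  # 1.0
```

Because the maximum of the scaled column is 1, running the same division again leaves the values unchanged.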

In [14]:
# 1) Convert 'Screen' into indicator variables named 'Screen_<value>'
df1 = pd.get_dummies(df['Screen'], prefix='Screen')
# 2) Append the indicator variables to the original DataFrame
df = df.join(df1)
# 3) Drop the original 'Screen' column from the DataFrame
df.drop(columns=['Screen'], inplace=True)
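
The same three steps can be tried on a small frame to see which columns they produce. This is a sketch under assumptions: the category labels and prices are made up for illustration, and `dtype=int` is added so the indicators are 0/1 integers, since recent pandas versions return booleans by default.

```python
import pandas as pd

# Toy data with assumed category labels, for illustration only
toy = pd.DataFrame({
    "Screen": ["Full HD", "IPS Panel", "Full HD"],
    "Price": [100, 200, 150],
})

# 1) One indicator column per distinct value, named 'Screen_<value>'
indicators = pd.get_dummies(toy["Screen"], prefix="Screen", dtype=int)

# 2) Append the indicators, 3) drop the original column
toy = toy.join(indicators).drop(columns=["Screen"])

print(toy.columns.tolist())
```

Each row has exactly one indicator set to 1, so the indicator columns together carry the same information as the original `Screen` column.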

## Authors


[Abhishek Gagneja](https://www.linkedin.com/in/abhishek-gagneja-23051987/)


## Change Log


|Date (YYYY-MM-DD)|Version|Changed By|Change Description|
|-|-|-|-|
|2023-12-10|0.1|Abhishek Gagneja|Initial Draft created|


Copyright © 2023 IBM Corporation. All rights reserved.
