# Introduction to JupyterLab and Jupyter Notebooks

This is a short introduction to two of the flagship tools created by [the Jupyter Community](https://jupyter.org).

> **⚠️Experimental!⚠️**: This is an experimental interface provided by the [JupyterLite project](https://jupyterlite.readthedocs.io/en/latest/). It embeds an entire JupyterLab interface, with many popular packages for scientific computing, in your browser. There may be minor differences in behavior between JupyterLite and the JupyterLab you install locally. You may also encounter some bugs or unexpected behavior. To report any issues, or to get involved with the JupyterLite project, see [the JupyterLite repository](https://github.com/jupyterlite/jupyterlite/issues?q=is%3Aissue+is%3Aopen+sort%3Aupdated-desc).

## JupyterLab 🧪

**JupyterLab** is a next-generation web-based user interface for Project Jupyter. It enables you to work with documents and activities such as Jupyter notebooks, text editors, terminals, and custom components in a flexible, integrated, and extensible manner. It is the interface that you're looking at right now.

**For an overview of the JupyterLab interface**, see the **JupyterLab Welcome Tour** on this page, by going to `Help -> Welcome Tour` and following the prompts.

> **See Also**: For a more in-depth tour of JupyterLab with a full environment that runs in the cloud, see [the JupyterLab introduction on Binder](https://mybinder.org/v2/gh/jupyterlab/jupyterlab-demo/HEAD?urlpath=lab/tree/demo).

## Jupyter Notebooks 📓

**Jupyter Notebooks** are a community standard for communicating and performing interactive computing. A notebook is a document that blends computations, outputs, explanatory text, mathematics, images, and rich media representations of objects.

JupyterLab is one interface used to create and interact with Jupyter Notebooks.

**For an overview of Jupyter Notebooks**, see the **Notebook Tour** on this page, by going to `Help -> Notebook Tour` and following the prompts.

> **See Also**: For a more in-depth tour of Jupyter Notebooks and the Classic Jupyter Notebook interface, see [the Jupyter Notebook IPython tutorial on Binder](https://mybinder.org/v2/gh/ipython/ipython-in-depth/HEAD?urlpath=tree/binder/Index.ipynb).

## An example: visualizing data in the notebook ✨

Below is an example of a code cell. We'll visualize some simple data using two popular packages in Python. We'll use [NumPy](https://numpy.org/) to create some random data, and [Matplotlib](https://matplotlib.org) to visualize it.

Note how the code and the results of running the code are bundled together.
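
A minimal sketch of the kind of cell described above, assuming NumPy and Matplotlib are available in the kernel. The seed, the variable names, and the marker scaling are illustrative choices, not requirements.

```python
import matplotlib.pyplot as plt
import numpy as np

# Seeded generator so the plot is reproducible (the seed is an arbitrary choice)
rng = np.random.default_rng(0)

# 100 random points along 3 dimensions: x position, y position, and a scale value
x, y, scale = rng.standard_normal((3, 100))

fig, ax = plt.subplots()
# Color each point by `scale` and size it by the magnitude of `scale`
ax.scatter(x, y, c=scale, s=np.abs(scale) * 500)
ax.set(title="Some random data, created with JupyterLab!")
plt.show()
```

When this runs in a notebook, the scatter plot appears directly beneath the cell, next to the code that produced it.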

In [None]:
"""
Objective:
Predict housing prices using both structured (tabular) data and real house images.

Approach:
1. Load a public dataset of US houses (metadata + images).
2. Preprocess and combine tabular features (bedrooms, bathrooms, area, zipcode) with CNN-extracted image features.
3. Train a multimodal regression model.
4. Evaluate using MAE & RMSE.
5. Provide visualizations and insights.

Skills Gained:
- Multimodal Machine Learning
- Convolutional Neural Networks (CNNs)
- Feature fusion (image + tabular)
- Regression modeling and evaluation
"""
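
The "feature fusion" idea in the approach above can be sketched with plain NumPy, assuming fusion simply means concatenating each house's image features with its tabular features. The array names and values are hypothetical toy data.

```python
import numpy as np

# Hypothetical toy inputs: 2 houses, 3 image-embedding values and 2 tabular values each
img_features = np.array([[0.1, 0.7, 0.3],
                         [0.9, 0.2, 0.5]])
tab_features = np.array([[3.0, 2.0],   # e.g. bedrooms, bathrooms
                         [4.0, 3.5]])

# Feature fusion by concatenation: one combined feature row per house
fused = np.concatenate([img_features, tab_features], axis=1)
print(fused.shape)  # (2, 5)
```

A downstream regressor then sees a single vector per house and can learn interactions between the visual and the tabular inputs.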

In [None]:
# =========================================================
# 2. Dataset Loading & Preprocessing
# =========================================================
!git clone -q https://github.com/emanhamed/Houses-dataset.git

import os, glob, math, warnings
warnings.filterwarnings("ignore")

import numpy as np
import pandas as pd
import cv2
import matplotlib.pyplot as plt

from sklearn.model_selection import train_test_split
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.metrics import mean_absolute_error, mean_squared_error

import tensorflow as tf
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input
from tensorflow.keras import layers, models

# Paths (relative to the working directory, where the repository was cloned above)
DATA_DIR = os.path.join("Houses-dataset", "Houses Dataset")
meta_path = os.path.join(DATA_DIR, "HousesInfo.txt")

# Load tabular data
cols = ["bed", "bath", "area", "zipcode", "price"]
df = pd.read_csv(meta_path, sep=r"\s+", header=None, names=cols)  # delim_whitespace is deprecated in recent pandas

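# Image loader: create 2x2 montage per house
def load_house_montage(idx, base_path=DATA_DIR, tile=128):
    """Return a 2x2 RGB montage for house `idx` (0-based), or None if an image is missing or unreadable."""
    # Files for house idx are matched by the prefix f"{idx+1}_" (file numbering starts at 1)
    paths = sorted(glob.glob(os.path.join(base_path, f"{idx+1}_*")))
    if len(paths) < 4:
        return None
    imgs = []
    for p in paths[:4]:
        img = cv2.imread(p)
        if img is None:
            return None
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # OpenCV loads images as BGR
        img = cv2.resize(img, (tile, tile))
        imgs.append(img)
    # Two tiles per row, two rows: final shape is (2*tile, 2*tile, 3)
    top = np.hstack((imgs[0], imgs[1]))
    bottom = np.hstack((imgs[2], imgs[3]))
    return np.vstack((top, bottom))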

# Build image dataset
images, keep_indices = [], []
for i in range(len(df)):
    m = load_house_montage(i)
    if m is not None:
        images.append(m)
        keep_indices.append(i)
df = df.iloc[keep_indices].reset_index(drop=True)
images = np.array(images, dtype="uint8")

# Preview
plt.figure(figsize=(12,4))
for i in range(3):
    plt.subplot(1,3,i+1)
    plt.imshow(images[i])
    plt.title(f"${df['price'][i]:,.0f} | {df['bed'][i]}bd/{df['bath'][i]}ba")
    plt.axis('off')
plt.show()
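
To see how the montage step in `load_house_montage` arranges its tiles, here is a small sketch that uses solid-gray placeholder tiles instead of real photos. The placeholder shades are an assumption made for illustration; only the `np.hstack`/`np.vstack` layout mirrors the cell above.

```python
import numpy as np

tile = 128  # same tile size as the default in load_house_montage

# Four solid-gray placeholder tiles stand in for the four photos of one house
tiles = [np.full((tile, tile, 3), shade, dtype="uint8") for shade in (0, 85, 170, 255)]

top = np.hstack((tiles[0], tiles[1]))      # shape (128, 256, 3)
bottom = np.hstack((tiles[2], tiles[3]))   # shape (128, 256, 3)
montage = np.vstack((top, bottom))         # shape (256, 256, 3)

print(montage.shape)
```

Each house therefore becomes a single 256x256 image, which the model later resizes to the input size expected by the CNN.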

In [None]:
# =========================================================
# 3. Model Development & Training
# =========================================================
# Split
X_tab = df[["bed", "bath", "area", "zipcode"]]
y = df["price"].astype(float)
X_img = images.copy()

X_tab_train, X_tab_test, X_img_train, X_img_test, y_train, y_test = train_test_split(
    X_tab, X_img, y, test_size=0.2, random_state=42
)

# Preprocess tabular
numeric_features = ["bed", "bath", "area"]
categorical_features = ["zipcode"]
tab_preproc = ColumnTransformer([
    ("num", StandardScaler(), numeric_features),
    ("cat", OneHotEncoder(handle_unknown="ignore", sparse_output=False), categorical_features),
])
X_tab_train_proc = tab_preproc.fit_transform(X_tab_train)
X_tab_test_proc = tab_preproc.transform(X_tab_test)

# Image feature extractor
IMG_INPUT = 224
def prep_images(x):
    x = tf.image.resize(x, (IMG_INPUT, IMG_INPUT))
    return preprocess_input(x)

base_cnn = MobileNetV2(weights="imagenet", include_top=False, input_shape=(IMG_INPUT, IMG_INPUT, 3))
base_cnn.trainable = False

img_input = tf.keras.Input(shape=(X_img_train.shape[1], X_img_train.shape[2], 3))
x = layers.Lambda(prep_images)(img_input)
x = base_cnn(x)
x = layers.GlobalAveragePooling2D()(x)
img_encoder = tf.keras.Model(img_input, x, name="image_encoder")

# Extract embeddings
img_train_embed = img_encoder.predict(X_img_train, verbose=0)
img_test_embed = img_encoder.predict(X_img_test, verbose=0)

# Fusion model
in_img = tf.keras.Input(shape=(img_train_embed.shape[1],))
in_tab = tf.keras.Input(shape=(X_tab_train_proc.shape[1],))
z = layers.Concatenate()([in_img, in_tab])
z = layers.Dense(256, activation="relu")(z)
z = layers.Dropout(0.3)(z)
z = layers.Dense(128, activation="relu")(z)
z = layers.Dropout(0.2)(z)
out = layers.Dense(1, activation="linear")(z)
fusion_model = tf.keras.Model([in_img, in_tab], out)

fusion_model.compile(optimizer=tf.keras.optimizers.Adam(1e-3), loss="mse", metrics=["mae"])
history = fusion_model.fit(
    [img_train_embed, X_tab_train_proc], y_train,
    validation_split=0.2,
    epochs=50, batch_size=16, verbose=1
)

In [None]:
# =========================================================
# 4. Evaluation
# =========================================================
y_pred = fusion_model.predict([img_test_embed, X_tab_test_proc]).ravel()
mae = mean_absolute_error(y_test, y_pred)
rmse = math.sqrt(mean_squared_error(y_test, y_pred))
print(f"MAE: ${mae:,.0f}")
print(f"RMSE: ${rmse:,.0f}")

# Plot Actual vs Predicted
plt.figure(figsize=(6,6))
plt.scatter(y_test, y_pred, alpha=0.7)
mn, mx = min(y_test.min(), y_pred.min()), max(y_test.max(), y_pred.max())
plt.plot([mn, mx], [mn, mx], 'r--')
plt.xlabel("Actual Price")
plt.ylabel("Predicted Price")
plt.title("Actual vs Predicted Prices")
plt.show()
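
For readers unfamiliar with the two metrics, the sketch below computes MAE and RMSE by hand with only the standard library. The prices are hypothetical values chosen to keep the arithmetic easy to follow; they are not results from the model.

```python
import math

# Hypothetical prices, not model output
actual = [300_000, 450_000, 500_000]
predicted = [310_000, 430_000, 530_000]

# Signed errors: 10_000, -20_000, 30_000
errors = [p - a for a, p in zip(actual, predicted)]

# MAE: mean of the absolute errors
toy_mae = sum(abs(e) for e in errors) / len(errors)
# RMSE: square root of the mean squared error
toy_rmse = math.sqrt(sum(e * e for e in errors) / len(errors))

print(f"MAE:  ${toy_mae:,.0f}")
print(f"RMSE: ${toy_rmse:,.0f}")
```

Because the errors are squared before averaging, RMSE weights large misses more heavily than MAE does, so a gap between the two hints at a few badly mispredicted houses.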


In [None]:
# =========================================================
# 5. Final Summary / Insights
# =========================================================
"""
Summary:
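- Combined CNN image features (frozen MobileNetV2 embeddings) with preprocessed tabular data in a single regression model.
- Evaluated on a held-out 20% test split; the MAE and RMSE are printed by the evaluation cell.
- The actual-vs-predicted scatter plot shows how closely predictions follow actual prices: points near the dashed diagonal are accurate.
- This approach demonstrates multimodal learning applied to real estate price prediction.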

Thanks to DeveloperHub Corporation for the opportunity and guidance, and special thanks to the mentors and team members involved.
"""

## Next steps 🏃

This is just a short introduction to JupyterLab and Jupyter Notebooks. See below for some more ways to interact with tools in the Jupyter ecosystem, and its community.

### Other notebooks in this demo

Here are some other notebooks in this demo. Each of the items below corresponds to a file or folder in the **file browser to the left**.

- [**`Lorenz.ipynb`**](Lorenz.ipynb) uses Python to demonstrate interactive visualizations and computations around the [Lorenz system](https://en.wikipedia.org/wiki/Lorenz_system). It shows off basic Python functionality, including more visualizations, data structures, and scientific computing libraries.
- [**`r.ipynb`**](r.ipynb) demonstrates the R programming language for statistical computing and data analysis.
- [**`cpp.ipynb`**](cpp.ipynb) demonstrates the C++ programming language for scientific computing and data analysis.
- [**`sqlite.ipynb`**](sqlite.ipynb) demonstrates how to use an in-browser SQLite kernel to run your own SQL commands from the notebook. It uses the [jupyterlite/xeus-sqlite-kernel](https://github.com/jupyterlite/xeus-sqlite-kernel).

### Other sources of information in Jupyter

- **More on using JupyterLab**: See [the JupyterLab documentation](https://jupyterlab.readthedocs.io/en/stable/) for more thorough information about how to install and use JupyterLab.
- **More interactive demos**: See [try.jupyter.org](https://try.jupyter.org) for more interactive demos with the Jupyter ecosystem.
- **Learn more about Jupyter**: See [the Jupyter community documentation](https://docs.jupyter.org) to learn more about the project, its community and tools, and how to get involved.
- **Join our discussions**: The [Jupyter Community Forum](https://discourse.jupyter.org) is a place where many in the Jupyter community ask questions, help one another, and discuss issues around interactive computing and our ecosystem.