<a href="https://colab.research.google.com/github/mobinapourmoshir/Functional-Deep-Learning/blob/main/DNN%20Basics.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
# Libraries
from sklearn.datasets import load_iris
import pandas as pd

In [2]:
# Load the Iris dataset
iris = load_iris()
print(iris.keys())

dict_keys(['data', 'target', 'frame', 'target_names', 'DESCR', 'feature_names', 'filename', 'data_module'])


The loaded dataset is a scikit-learn Bunch object, which behaves like a dictionary whose keys can also be read as attributes (for example, iris.data). Its main entries are:

- data: The feature matrix (sepal length, sepal width, petal length, petal width)
- target: The target labels (species of iris flower: 0, 1, 2)
- feature_names: Names of the features
- target_names: Names of the target classes (setosa, versicolor, virginica)
- DESCR: A description of the dataset
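
As a quick check of the entries listed above, the sketch below reads the same Bunch by key and by attribute and prints the array shapes. The variable names features_by_key and features_by_attr are only illustrative.

```python
from sklearn.datasets import load_iris

iris = load_iris()

# Key-style and attribute-style access point to the same underlying array.
features_by_key = iris["data"]
features_by_attr = iris.data
print(features_by_key is features_by_attr)

# One row per flower, one column per feature.
print(iris.data.shape)
print(iris.target.shape)
print(list(iris.target_names))
```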

To convert the feature matrix to a pandas DataFrame:

In [3]:
df = pd.DataFrame(data=iris.data, columns=iris.feature_names)
print(df.head())

   sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)
0                5.1               3.5                1.4               0.2
1                4.9               3.0                1.4               0.2
2                4.7               3.2                1.3               0.2
3                4.6               3.1                1.5               0.2
4                5.0               3.6                1.4               0.2
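
The DataFrame above holds only the features. One possible way to attach the labels is sketched below; the column name species is an assumption chosen for illustration, not something the dataset defines.

```python
import pandas as pd
from sklearn.datasets import load_iris

iris = load_iris()
df = pd.DataFrame(data=iris.data, columns=iris.feature_names)

# Map the integer codes in iris.target (0, 1, 2) to their class names.
# "species" is a hypothetical column name used only in this sketch.
df["species"] = pd.Categorical.from_codes(iris.target, categories=iris.target_names)

# Count how many samples belong to each class.
print(df["species"].value_counts())
```

Storing the labels as a categorical column keeps the link between each integer code and its class name, which can be handy when the codes are later turned back into tensors.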


## Difference between Tensor, TensorDataset, and DataLoader

**Tensor:** A multi-dimensional array, similar to a nested Python list but with a single data type and a fixed shape.

**TensorDataset:** Pairs features (input data) with labels (answers). It wraps multiple tensors that share the same first dimension (e.g., features X and labels y) into a dataset, so that each sample can be accessed as a tuple (X[i], y[i]).

**DataLoader:** Provides an iterator over a dataset to feed data into a model in mini-batches. Handles batching and shuffling automatically, and can load data in parallel when num_workers is set above 0.

In [6]:
##### Tensor #####
import torch

# 1D tensor
a = torch.tensor([1, 2, 3])
print(a)
print(a.shape)  # (3,)
print("--------")
# 2D tensor
b = torch.tensor([[1, 2], [3, 4]])
print(b)
print(b.shape)  # (2, 2)

tensor([1, 2, 3])
torch.Size([3])
--------
tensor([[1, 2],
        [3, 4]])
torch.Size([2, 2])
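
To illustrate how a tensor differs from a plain Python list, the following sketch inspects the data type and number of dimensions, then compares multiplication by 2 on a list and on a tensor. The names doubled_list and doubled_tensor are illustrative only.

```python
import torch

a = torch.tensor([1, 2, 3])                  # built from Python ints
b = torch.tensor([[1.0, 2.0], [3.0, 4.0]])   # built from Python floats

# Every tensor has one dtype for all elements and a number of dimensions.
print(a.dtype, a.ndim)
print(b.dtype, b.ndim)

# Multiplying a list repeats it; multiplying a tensor works element-wise.
doubled_list = [1, 2, 3] * 2
doubled_tensor = a * 2
print(doubled_list)
print(doubled_tensor)
```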


In [10]:
##### Tensor Dataset #####
from torch.utils.data import TensorDataset

X = torch.tensor([[1,2], [3,4], [5,6]])  # features
y = torch.tensor([0, 1, 0])             # labels

dataset = TensorDataset(X, y)
print(dataset[0])

(tensor([1, 2]), tensor(0))
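
The same pairing can be applied to the Iris data loaded earlier. This is a minimal sketch, assuming float32 features and int64 labels (a common choice for classification in PyTorch, not something the notebook prescribes); the names features, labels, and iris_dataset are illustrative.

```python
import torch
from sklearn.datasets import load_iris
from torch.utils.data import TensorDataset

iris = load_iris()

# Convert the NumPy arrays to tensors with explicit dtypes (assumed here).
features = torch.tensor(iris.data, dtype=torch.float32)
labels = torch.tensor(iris.target, dtype=torch.long)

# Both tensors have 150 rows, so they can be paired sample by sample.
iris_dataset = TensorDataset(features, labels)

x0, y0 = iris_dataset[0]
print(len(iris_dataset))
print(x0.shape, y0)
```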


With shuffle=True, the DataLoader reshuffles the sample order every time it is iterated, so the batches differ from run to run. Because the dataset has 3 samples and batch_size=2, the last batch contains only one sample.

In [15]:
##### DataLoaders #####
from torch.utils.data import DataLoader

loader = DataLoader(dataset, batch_size=2, shuffle=True)

for batch_X, batch_y in loader:
    print("batch_X: ", batch_X)
    print("batch_y: ", batch_y)

batch_X:  tensor([[5, 6],
        [1, 2]])
batch_y:  tensor([0, 0])
batch_X:  tensor([[3, 4]])
batch_y:  tensor([1])
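
If the shuffled order needs to be repeatable, one option is to pass a seeded torch.Generator to the DataLoader; the drop_last argument discards an incomplete final batch. The sketch below assumes the same toy tensors as above, and the helper first_epoch_labels is hypothetical, written only for this illustration.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

X = torch.tensor([[1, 2], [3, 4], [5, 6]])  # features
y = torch.tensor([0, 1, 0])                 # labels
dataset = TensorDataset(X, y)

def first_epoch_labels(seed):
    """Hypothetical helper: label batches from one pass with a seeded shuffle."""
    generator = torch.Generator().manual_seed(seed)
    loader = DataLoader(dataset, batch_size=2, shuffle=True, generator=generator)
    return [batch_y.tolist() for _, batch_y in loader]

# The same seed gives the same shuffled order on both passes.
run_a = first_epoch_labels(0)
run_b = first_epoch_labels(0)
print(run_a == run_b)

# drop_last=True skips the final batch when it has fewer than batch_size samples.
full_batches_only = DataLoader(dataset, batch_size=2, drop_last=True)
print(len(full_batches_only))
```

Seeding a dedicated generator keeps the shuffle reproducible without changing the global random state used elsewhere in the notebook.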
