
Problems:

1. Reshape an M x N tensor into an (M * N // 2) x 2 tensor
2. Find the average of every column in a tensor
3. Concatenate an M x N tensor and an M x M tensor into an M x (M + N) tensor
4. Compute the mean squared error loss between a prediction and a target

In [9]:
import torch
import torch.nn

# Helpful functions:
# https://pytorch.org/docs/stable/generated/torch.reshape.html
# https://pytorch.org/docs/stable/generated/torch.mean.html
# https://pytorch.org/docs/stable/generated/torch.cat.html
# https://pytorch.org/docs/stable/generated/torch.nn.functional.mse_loss.html

# Round your answers to 4 decimal places using torch.round(input_tensor, decimals = 4)

def reshape(to_reshape):
  # torch.reshape() will be useful - check out the documentation

  row_num = to_reshape.shape[0]
  col_num = to_reshape.shape[1]

  new_row_num = row_num*col_num//2
  return torch.reshape(to_reshape, (new_row_num, 2))

def average(to_avg):
  # torch.mean() will be useful - check out the documentation
  # dim=0 averages over the rows, giving one mean per column
  return torch.round(torch.mean(to_avg, dim=0), decimals=4)

def concatenate(cat_one, cat_two):
  # torch.cat() will be useful - check out the documentation
  # Keep the argument order: cat_one's columns come first, then cat_two's
  return torch.cat((cat_one, cat_two), dim=1)

def get_loss(prediction, target):
  # torch.nn.functional.mse_loss() will be useful - check out the documentation
  return torch.round(torch.nn.functional.mse_loss(prediction, target), decimals=4)
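
`get_loss` is not exercised in the cells that follow. As a quick sanity check, the sketch below calls `torch.nn.functional.mse_loss` directly on two small tensors. The values are made up for illustration and do not come from the notebook. Only the last element differs, by 2, so the expected loss is (0 + 0 + 0 + 4) / 4 = 1.

```python
import torch
import torch.nn.functional as F

# Illustrative values, not taken from the notebook
prediction = torch.tensor([1.0, 2.0, 3.0, 4.0])
target = torch.tensor([1.0, 2.0, 3.0, 6.0])

# By default mse_loss averages the squared element-wise differences
loss = F.mse_loss(prediction, target)
print(loss)  # tensor(1.)
```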


In [15]:
to_reshape=torch.tensor([
  [0.8088, 1.2614, -1.4371],
  [-0.0056, -0.2050, -0.7201]
])
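
The tensor above is named `to_reshape`, but the notebook never shows the reshape itself. A minimal sketch of that step, calling `torch.reshape` directly and assuming the total number of elements (M * N) is even so that it splits cleanly into pairs:

```python
import torch

to_reshape = torch.tensor([
    [0.8088, 1.2614, -1.4371],
    [-0.0056, -0.2050, -0.7201],
])

# 2 * 3 // 2 = 3 rows of 2 columns each
rows = to_reshape.shape[0] * to_reshape.shape[1] // 2
reshaped = torch.reshape(to_reshape, (rows, 2))
print(reshaped.shape)  # torch.Size([3, 2])
```

Elements are read in row-major order, so the second row of the result pairs the last value of the first input row with the first value of the second input row.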

In [16]:
to_avg = average(to_reshape)
print(to_avg)

tensor([ 0.4016,  0.5282, -1.0786])


In [17]:
cat_one = torch.tensor([
  [1.0, 1.0, 1.0],
  [1.0, 1.0, 1.0]
])

cat_two = torch.tensor([
  [1.0, 1.0],
  [1.0, 1.0]
])

In [18]:
concatenate(cat_one, cat_two)

tensor([[1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.]])
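
Because both inputs above are all ones, the printed result cannot show which tensor's columns end up on the left. A small sketch with distinguishable values (zeros and ones, chosen here only for illustration) makes the ordering visible when calling `torch.cat` directly:

```python
import torch

left = torch.zeros(2, 3)   # M x N with M = 2, N = 3
right = torch.ones(2, 2)   # M x M

# dim=1 joins along columns; tensors appear in the order they are passed
combined = torch.cat((left, right), dim=1)
print(combined)
# tensor([[0., 0., 0., 1., 1.],
#         [0., 0., 0., 1., 1.]])
```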

Write a Python function that performs feature scaling on a dataset using both standardization and min-max normalization. The function should take a 2D NumPy array as input, where each row represents a data sample and each column represents a feature. It should return two 2D NumPy arrays: one scaled by standardization and one scaled by min-max normalization. All results should be rounded to 4 decimal places.

In [48]:
import torch
import numpy as np

def feature_scaling(data) -> tuple[torch.Tensor, torch.Tensor]:
    """
    Standardize and Min-Max normalize input data using PyTorch.
    Input: Tensor or convertible of shape (m,n).
    Returns (standardized_data, normalized_data), both rounded to 4 decimals.
    """
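    data_t = torch.as_tensor(data, dtype=torch.float)

    # Per-feature (column-wise) statistics
    data_mean = torch.mean(data_t, dim=0)
    data_sd = torch.std(data_t, dim=0, unbiased=False)  # population standard deviation
    data_min = torch.min(data_t, dim=0).values
    data_max = torch.max(data_t, dim=0).values

    # Standardization: zero mean and unit variance per feature
    standardized_data = (data_t - data_mean) / data_sd
    # Min-max normalization: rescale each feature to the range [0, 1]
    normalized_data = (data_t - data_min) / (data_max - data_min)

    # Round to 4 decimals, as the docstring promises
    return (torch.round(standardized_data, decimals=4),
            torch.round(normalized_data, decimals=4))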





In [49]:
data = torch.tensor(np.array([[1, 2], [3, 4], [5, 6]]))

In [50]:
feature_scaling(data)

(tensor([[-1.2247, -1.2247],
         [ 0.0000,  0.0000],
         [ 1.2247,  1.2247]]),
 tensor([[0.0000, 0.0000],
         [0.5000, 0.5000],
         [1.0000, 1.0000]]))
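
The problem statement asks for NumPy arrays, while the cell above works with PyTorch tensors. For comparison, here is a minimal NumPy-only sketch of the same two formulas, z = (x - mean) / std and x' = (x - min) / (max - min), applied per column. The name `feature_scaling_np` is hypothetical and not part of the notebook. The sketch assumes no feature is constant, since a constant column would cause a division by zero.

```python
import numpy as np

def feature_scaling_np(data: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Hypothetical NumPy counterpart: returns (standardized, normalized)."""
    data = np.asarray(data, dtype=float)

    # Column-wise statistics; np.std defaults to the population form (ddof=0)
    mean = data.mean(axis=0)
    std = data.std(axis=0)
    col_min = data.min(axis=0)
    col_max = data.max(axis=0)

    standardized = (data - mean) / std
    normalized = (data - col_min) / (col_max - col_min)

    return np.round(standardized, 4), np.round(normalized, 4)

standardized, normalized = feature_scaling_np(np.array([[1, 2], [3, 4], [5, 6]]))
print(standardized)
print(normalized)
```

On the sample data this should agree with the PyTorch output shown above, because `np.std` with its default `ddof=0` matches `torch.std(..., unbiased=False)`.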