
# What is FunctionTransformer in machine learning?

FunctionTransformer is a tool in scikit-learn, a popular Python library for machine learning, that wraps a function of your choice as a transformer. It is useful for performing custom transformations of input data in a machine learning pipeline.

FunctionTransformer takes a single callable and, when `transform` is called, passes it the whole input array `X` at once (not one sample at a time). The callable can be any Python function that accepts the data as its first argument, such as a NumPy function, a lambda function, or a user-defined function. It should return the transformed data.

In [None]:
from sklearn.preprocessing import FunctionTransformer
import numpy as np

# create a dataset
X = np.array([[1, 2], [3, 4]])

# wrap np.log1p, which computes log(1 + x), in a FunctionTransformer
log_transform = FunctionTransformer(np.log1p)

# apply the transformation to the dataset
X_transformed = log_transform.transform(X)

# view the transformed data
print(X_transformed)

[[0.69314718 1.09861229]
 [1.38629436 1.60943791]]


# Types of function transformer in machine learning

scikit-learn provides two transformers that are commonly used for custom transformations:

FunctionTransformer (in `sklearn.preprocessing`) - This transformer allows you to specify a single function that will be applied to the entire input data matrix. It can be useful for custom feature scaling or feature extraction.

ColumnTransformer (in `sklearn.compose`) - This transformer allows you to specify a different transformer for each column or subset of columns in the input data matrix. It is not a kind of FunctionTransformer, but any of its per-column transformers can be a FunctionTransformer. It is useful for applying different transformations to different features in a dataset.

Both of these transformers are part of the scikit-learn library in Python and can be used in a machine learning pipeline to preprocess data before training a model.
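
The following is a minimal sketch of how the two transformers can be combined, not a recommended recipe. The data values are made up, and the choice of `StandardScaler` as a second pipeline step is an assumption for illustration only.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer, StandardScaler

# Hypothetical data: column 0 is skewed, column 1 is not
X = np.array([[1.0, 10.0],
              [10.0, 20.0],
              [100.0, 30.0]])

# ColumnTransformer routes columns to transformers:
# log1p is applied to column 0 only, column 1 is passed through unchanged
by_column = ColumnTransformer(
    transformers=[("log", FunctionTransformer(np.log1p), [0])],
    remainder="passthrough",
)

# The column-wise step can then be chained with other steps in a Pipeline
pipeline = Pipeline([
    ("by_column", by_column),
    ("scale", StandardScaler()),
])

X_by_column = by_column.fit_transform(X)
X_scaled = pipeline.fit_transform(X)

print(X_by_column)
print(X_scaled)
```

In the output of `by_column`, the transformed column comes first and the passed-through column follows it.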

# When should I use FunctionTransformer in machine learning?

We might consider using a FunctionTransformer in a machine learning pipeline in the following situations:

Custom feature engineering: If you want to engineer new features using a custom function, you can use a FunctionTransformer to apply the function to the input data matrix and create new features based on the output.

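Scaling and normalization: If you want to scale or normalize the input data matrix in a custom way, you can use a FunctionTransformer to apply a custom scaling or normalization function. Because FunctionTransformer is stateless, the function sees only the array it is given, so statistics such as a training-set maximum are not remembered between `fit` and `transform`.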

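Data cleaning: If you want to clean the input data matrix by handling outliers, filling missing values, or replacing certain values, you can use a FunctionTransformer to apply a custom cleaning function. Inside a pipeline the function should keep the number of rows unchanged, since dropping rows would leave the target `y` misaligned.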

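Dimensionality reduction: If you want to reduce the dimensionality of the input data matrix by selecting a subset of features, you can use a FunctionTransformer to apply the selection function. Techniques that must learn parameters from the training data, such as PCA, are better served by their dedicated estimators (for example `sklearn.decomposition.PCA`), because FunctionTransformer does not store anything learned during `fit`.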

In general, a FunctionTransformer can be useful for any situation in which you want to apply a custom function to the input data matrix before training a machine learning model.
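
As a small sketch of the feature-selection case, the hypothetical helper `select_columns` below keeps a subset of columns. The `kw_args` parameter of FunctionTransformer forwards extra keyword arguments to the wrapped function; the data and the chosen column indices are assumptions for illustration.

```python
import numpy as np
from sklearn.preprocessing import FunctionTransformer

# Hypothetical dataset with three features
X = np.array([[1, 2, 3],
              [4, 5, 6]])

# Hypothetical helper: keep only the requested column indices
def select_columns(X, columns):
    return X[:, columns]

# kw_args supplies the extra `columns` argument on every call
selector = FunctionTransformer(select_columns, kw_args={"columns": [0, 2]})

X_selected = selector.fit_transform(X)
print(X_selected)
# [[1 3]
#  [4 6]]
```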

In [None]:
import numpy as np

In [None]:
# Practical use cases

# 1. Custom Feature Engineering

from sklearn.preprocessing import FunctionTransformer
import numpy as np

# create a dataset
X = np.array([[1, 2], [3, 4]])

# define a custom feature engineering function:
# append the square of every feature as new columns
def squ(X):
    return np.hstack((X, X**2))

# create a FunctionTransformer to apply the custom function
custom_transformer = FunctionTransformer(squ)

# apply the transformer to the input data
X_transformed = custom_transformer.transform(X)

# view the transformed data
print(X_transformed)


[[ 1  2  1  4]
 [ 3  4  9 16]]


In [None]:
# np.hstack joins arrays horizontally; for 1-D arrays that means end to end
a = np.array([1, 2, 3, 4])
b = np.array([5, 6, 7, 8])
np.hstack((a, b))

array([1, 2, 3, 4, 5, 6, 7, 8])

In [None]:
# 2. Scaling And Normalization

from sklearn.preprocessing import FunctionTransformer
import numpy as np

# create a dataset
X = np.array([[1, 2], [3, 4]])

# define a custom scaling function: divide by the largest value in the array it receives
def my_scaling(X):
    return X / np.max(X)

# create a FunctionTransformer to apply the custom function
custom_transformer = FunctionTransformer(my_scaling)

# apply the transformer to the input data
X_transformed = custom_transformer.transform(X)

# view the transformed data
print(X_transformed)

[[0.25 0.5 ]
 [0.75 1.  ]]


In [None]:
# 3. Data Cleaning

from sklearn.preprocessing import FunctionTransformer
import numpy as np

# create a dataset with missing values
X = np.array([[1, 2], [3, np.nan]])

# define a custom cleaning function
def my_cleaning(X):
    # work on a copy so the caller's array is not modified in place
    X = X.copy()
    X[np.isnan(X)] = 0
    return X

# create a FunctionTransformer to apply the custom function
custom_transformer = FunctionTransformer(my_cleaning)

# apply the transformer to the input data
X_transformed = custom_transformer.transform(X)

# view the transformed data
print(X_transformed)


[[1. 2.]
 [3. 0.]]


# Real-life use cases of FunctionTransformer


There are many real-life use cases where FunctionTransformer can be useful in machine learning pipelines. Here are a few examples:

1. Image processing: In computer vision applications, FunctionTransformer can be used to apply custom functions to preprocess image data. For example, a custom function can be used to resize images, change the color balance, or apply filters to improve image quality.

2. Natural language processing: In NLP applications, FunctionTransformer can be used to preprocess text data by applying custom functions to perform tasks such as tokenization, stemming, or removing stop words.

3. Financial modeling: In finance, FunctionTransformer can be used to preprocess financial data by applying custom functions to transform the data, such as scaling stock prices, normalizing financial ratios, or imputing missing values.

4. Audio signal processing: In speech recognition or music analysis applications, FunctionTransformer can be used to preprocess audio data by applying custom functions to perform tasks such as filtering noise, extracting features such as MFCCs (Mel-frequency cepstral coefficients), or resampling the audio signal.

5. Sensor data processing: In Internet of Things (IoT) applications, FunctionTransformer can be used to preprocess sensor data by applying custom functions to remove outliers, impute missing values, or rescale sensor readings.
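
To make the text-preprocessing case concrete, here is a minimal sketch. The helper `clean_text`, the tiny `STOP_WORDS` set, and the sample sentences are all hypothetical; a real project would likely use a proper tokenizer and stop-word list.

```python
from sklearn.preprocessing import FunctionTransformer

# Hypothetical stop-word list, kept tiny on purpose
STOP_WORDS = {"the", "is", "a", "of"}

def clean_text(docs):
    """Lowercase each document and drop stop words."""
    cleaned = []
    for doc in docs:
        tokens = doc.lower().split()
        cleaned.append(" ".join(t for t in tokens if t not in STOP_WORDS))
    return cleaned

# validate=False is the default, so a plain list of strings is passed through as-is
text_cleaner = FunctionTransformer(clean_text)

docs = ["The model is a classifier", "Evaluation of the model"]
docs_cleaned = text_cleaner.transform(docs)
print(docs_cleaned)
# ['model classifier', 'evaluation model']
```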

In [None]:
import numpy as np
import pandas as pd

In [None]:
df = pd.read_csv("C:\\Users\\saurabh\\Desktop\\Newdat\\placement.csv")

In [None]:
df.head(3)

   cgpa  resume_score  placed
0  8.14          6.52       1
1  6.17          5.17       0
2  8.27          8.86       1


In [None]:
x = df.drop(columns = ['placed'])
y = df['placed']

In [None]:
from sklearn.preprocessing import FunctionTransformer

In [None]:
log_transform = FunctionTransformer(np.log1p)

# apply the transformation to the dataset
X_transformed = log_transform.transform(x)


In [None]:
X_transformed

        cgpa  resume_score
0   2.212660      2.017566
1   1.969906      1.819699
2   2.226783      2.288486
3   2.064328      2.112635
4   2.142416      2.116256
..       ...           ...
95  1.991976      1.998774
96  2.222459      2.170196
97  2.034706      2.172476
98  2.212660      1.891605
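
The same log transformation can also be placed inside a pipeline so that it runs automatically before a model. The sketch below is only illustrative: `placement.csv` is not available here, so it uses a small made-up DataFrame with the same column names, and the choice of `LogisticRegression` is an assumption rather than something taken from this notebook.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer

# Made-up stand-in for placement.csv (values are illustrative only)
df_demo = pd.DataFrame({
    "cgpa": [8.1, 6.2, 8.3, 5.9, 7.4, 6.0],
    "resume_score": [6.5, 5.2, 8.9, 4.8, 7.1, 5.5],
    "placed": [1, 0, 1, 0, 1, 0],
})
x_demo = df_demo.drop(columns=["placed"])
y_demo = df_demo["placed"]

# The log transform runs first, then the classifier is trained on its output
model = Pipeline([
    ("log", FunctionTransformer(np.log1p)),
    ("clf", LogisticRegression()),
])
model.fit(x_demo, y_demo)

predictions = model.predict(x_demo)
print(predictions)
```

Keeping the transformation in the pipeline means the same `np.log1p` step is applied at both training and prediction time.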


In [None]:
# covid dataset -> encode -> x data -> log transformation