Skip to content

devil-cyber/Handling-Categorical-Variable

Repository files navigation

Categorical Variable handling

  • This is a package that contain some powerful Categorical feature handling algorithm :
  • For that you have to install this library using pip command
  • Then observe you dataset and figure out which encoding technique you wants to used
  • After that call that library and do as follows:
Download package : pip install categorical-variable-handling==0.1
from categorical_variable _handling import CategoricalFeature
column="Give here the list of Categorical column"
dataframe="Here give the dataframe"
encoding_type="Here give the encoding type e.g:'label','binary','ohe':one hot encoder"
handle_na="this is a boolean value if you wants to handle the NaN value then make true else make false "

cat = CategoricalFeature(df, categorical_features=c, encoding_type='ohe', handle_na=True)
cat.fit_transform() #This will return the transformed value of the dataset
For  a input like this:
    m     n     p
0 india  1000  rav1
1 us     2000  mani
2 japan  3000  ravi
3 us     4000  shri
4 japan  5000  shri


Output will look like this:
      n  m__bin_0  m__bin_1  m__bin_2  p__bin_0  p__bin_1  p__bin_2  p__bin_3
0  1000         1         0         0         0         1         0         0
1  2000         0         0         1         1         0         0         0
2  3000         0         1         0         0         0         1         0
3  4000         0         0         1         0         0         0         1
4  5000         0         1         0         0         0         0         1


By. Manikant Kumar