# ColumnTransformer
ColumnTransformer, from scikit-learn's sklearn.compose module, applies different preprocessing steps to different columns of a dataset and concatenates the results into a single feature matrix.
### Why it's useful:
- In real-world datasets, different types of features require different preprocessing. For example:
  - Numeric features might need scaling (e.g., StandardScaler).
  - Categorical features might need encoding (e.g., OneHotEncoder).

Doing this by hand for each column is tedious and error-prone. ColumnTransformer bundles the per-column steps into a single object that is fitted and applied in one call.
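
As a rough illustration of the idea above, the sketch below scales one numeric column and one-hot encodes one categorical column in a single `ColumnTransformer`. The `toy` frame and its values are made up for this example and are not the dataset loaded later in this notebook; the transformer names `"num"` and `"cat"` are arbitrary labels.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Tiny illustrative frame (made-up values, not the notebook's dataset)
toy = pd.DataFrame({
    "Age": [25, 40, 55],
    "City": ["Chicago", "Houston", "Chicago"],
})

# Each entry is (name, transformer, columns the transformer applies to)
preprocessor = ColumnTransformer(
    transformers=[
        ("num", StandardScaler(), ["Age"]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["City"]),
    ]
)

# Fit every transformer on its own columns and stack the outputs side by side
transformed = preprocessor.fit_transform(toy)
print(transformed.shape)
```

The result has one scaled Age column followed by one indicator column per distinct city. Columns not listed in any transformer are dropped by default; passing `remainder="passthrough"` keeps them unchanged.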

In [1]:
import pandas as pd

df = pd.read_csv('people_data_with_target.csv')
df.head()

   Age  Gender  EducationLevel         City  Income  HighIncome
0   56    Male     High School  Los Angeles  102762           1
1   46    Male       Bachelors      Houston  100020           1
2   32    Male         Masters     New York   77310           0
3   60    Male             PhD  Los Angeles   38405           0
4   25    Male       Bachelors      Chicago   58522           0


In [2]:
df.describe()

              Age         Income  HighIncome
count  500.000000     500.000000  500.000000
mean    41.278000   71212.558000    0.342000
std     13.389072   20784.357982    0.474855
min     18.000000   17803.000000    0.000000
25%     30.000000   57747.250000    0.000000
50%     42.000000   71026.000000    0.000000
75%     52.000000   85387.000000    1.000000
max     64.000000  132097.000000    1.000000


In [3]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 500 entries, 0 to 499
Data columns (total 6 columns):
Age              500 non-null int64
Gender           500 non-null object
EducationLevel   470 non-null object
City             500 non-null object
Income           500 non-null int64
HighIncome       500 non-null int64
dtypes: int64(3), object(3)
memory usage: 23.6+ KB

In [4]:
df.isnull().sum()

Age                0
Gender             0
EducationLevel    30
City               0
Income             0
HighIncome         0
dtype: int64
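
The 30 missing EducationLevel values have to be dealt with before that column can be encoded. One possible approach, sketched here as an assumption rather than as the method this notebook necessarily uses, is to chain a `SimpleImputer` and a `OneHotEncoder` in a `Pipeline` and hand that pipeline to `ColumnTransformer`. The `toy` frame below is made up to mimic the missing entries.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Made-up rows mimicking the missing EducationLevel entries
toy = pd.DataFrame({
    "EducationLevel": ["Bachelors", np.nan, "PhD", "Bachelors"],
    "Age": [25, 40, 55, 33],
})

# Impute first, then encode, so the encoder never sees NaN
education_pipeline = Pipeline(steps=[
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("encode", OneHotEncoder(handle_unknown="ignore")),
])

preprocessor = ColumnTransformer(
    transformers=[("edu", education_pipeline, ["EducationLevel"])],
    remainder="passthrough",  # keep Age as-is, appended after the encoded columns
    sparse_threshold=0,       # always return a dense NumPy array
)

result = preprocessor.fit_transform(toy)
print(result)
```

Filling with the most frequent category is only one choice; a constant placeholder such as `SimpleImputer(strategy="constant", fill_value="Unknown")` would instead keep the missing rows as their own category.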

In [5]:
# Drop Income: HighIncome appears to be derived from it, so keeping it would leak the target
df = df.drop(columns=['Income'])