## Python Machine Learning Interview Questions

Every day, 10 Python and data-related problems are uploaded to help improve your skills for roles like Data Analyst and Data Scientist.

## 1. Even-Odd Split with Mean
Given a list of integers, split it into even and odd numbers, then print the mean of each group.
Input: [1, 2, 3, 4, 5, 6] → Output: Even Mean: 4.0, Odd Mean: 3.0

In [1]:
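import statistics

my_list = [1, 2, 3, 4, 5, 6]
# Keep only the values divisible by 2
even = [i for i in my_list if i % 2 == 0]
# statistics.mean returns the int 4 here because every input is an int; wrap it in float() to get 4.0
statistics.mean(even)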

4

In [2]:
# Keep only the values not divisible by 2
odd = [i for i in my_list if i % 2 != 0]
statistics.mean(odd)

3
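
The task asks for both means printed with labels, while the cells above only display the raw values. A minimal sketch of one way to do that follows; the function name even_odd_means is hypothetical, and returning None for an empty group is an assumption, not part of the original problem.

```python
import statistics

def even_odd_means(values):
    """Return (even_mean, odd_mean) as floats; None for an empty group."""
    even = [v for v in values if v % 2 == 0]
    odd = [v for v in values if v % 2 != 0]
    # float() keeps the result as 4.0 rather than the int 4
    even_mean = float(statistics.mean(even)) if even else None
    odd_mean = float(statistics.mean(odd)) if odd else None
    return even_mean, odd_mean

even_mean, odd_mean = even_odd_means([1, 2, 3, 4, 5, 6])
print(f"Even Mean: {even_mean}, Odd Mean: {odd_mean}")
# Even Mean: 4.0, Odd Mean: 3.0
```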

## 2. Mode Finder
Write a function that finds the mode (most frequent element) in a list.
Hint: Use a dictionary or collections.Counter.

In [3]:
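def mode(lst):
    """Return the most frequent element; ties go to the value seen first."""
    max_count = 0
    mode_value = None

    for i in lst:
        # lst.count scans the whole list, so this loop is O(n^2) overall
        count = lst.count(i)
        if count > max_count:
            max_count = count
            mode_value = i

    return mode_value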


In [4]:
nums = [1, 1, 5, 7, 7, 7, 8]
print(mode(nums))

7
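
The hint mentions collections.Counter, which counts every element in a single pass. The sketch below is one possible variant; the name mode_counter and the choice to return None for an empty list are assumptions made for illustration.

```python
from collections import Counter

def mode_counter(lst):
    """Return the most frequent element of lst, or None if lst is empty."""
    if not lst:
        return None
    # most_common(1) yields a one-item list of (value, count) pairs
    value, _count = Counter(lst).most_common(1)[0]
    return value

print(mode_counter([1, 1, 5, 7, 7, 7, 8]))  # 7
```

For a list of n items this runs in roughly O(n) time, compared with the repeated lst.count calls in the loop version.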


## 3. GroupBy Aggregation
df = pd.DataFrame({
    'Department': ['HR', 'Finance', 'HR', 'IT', 'Finance'],
    'Salary': [30000, 50000, 32000, 45000, 52000]
})
Use groupby to compute the average salary for each department.

In [5]:
import pandas as pd
df = pd.DataFrame({
    'Department': ['HR', 'Finance', 'HR', 'IT', 'Finance'],
    'Salary': [30000, 50000, 32000, 45000, 52000]
})

In [6]:
df.groupby('Department')['Salary'].mean()

Department
Finance    51000.0
HR         31000.0
IT         45000.0
Name: Salary, dtype: float64

## 4. Binning Ages
Given an age column, create bins such as "Youth (<=25)", "Adult (26-60)", and "Senior (>60)".
Hint: Use pd.cut() and add a new column Age_Group

In [7]:
df = pd.DataFrame({
    'Department': ['HR', 'Finance', 'HR', 'IT', 'Finance'],
    'Salary': [30000, 50000, 32000, 45000, 52000],
    'Age': [21, 35, 65, 28, 56]
})

In [8]:
# right=True makes each bin include its upper edge: (0, 25], (25, 60], (60, 100]
df['Age_Group'] = pd.cut(df['Age'], bins=[0, 25, 60, 100], labels=['Youth', 'Adult', 'Senior'], right=True)

In [9]:
df['Age_Group']

0     Youth
1     Adult
2    Senior
3     Adult
4     Adult
Name: Age_Group, dtype: category
Categories (3, object): ['Youth' < 'Adult' < 'Senior']
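
To make the bin boundaries explicit, here is a dependency-free sketch of the same right-inclusive rule using the standard-library bisect module. The function age_group and its default edges are hypothetical. Unlike the pd.cut call above, it has no lower bound of 0 or upper bound of 100, so every age gets a label.

```python
from bisect import bisect_left

def age_group(age, edges=(25, 60), labels=("Youth", "Adult", "Senior")):
    """Map an age to a label; an age equal to an edge falls in the lower bin."""
    # bisect_left counts the edges strictly below age
    return labels[bisect_left(edges, age)]

ages = [21, 35, 65, 28, 56]
print([age_group(a) for a in ages])
# ['Youth', 'Adult', 'Senior', 'Adult', 'Adult']
```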

## 5. Z-Score Normalization
Write a function to standardize a list using Z-score:
z = (x - mean) / std_dev

In [10]:
# Series.std() defaults to the sample standard deviation (ddof=1)
df['normalised_age'] = (df['Age'] - df['Age'].mean()) / df['Age'].std()
df['normalised_age']

0   -1.066761
1   -0.320028
2    1.280114
3   -0.693395
4    0.800071
Name: normalised_age, dtype: float64
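
The question asks for a function that works on a list, whereas the cell above operates on a DataFrame column. A minimal sketch of such a function is shown below. The name z_scores and the sample flag are assumptions; sample=True uses the n - 1 denominator, which matches the pandas default used above.

```python
import statistics

def z_scores(values, sample=True):
    """Standardize values as (x - mean) / std_dev."""
    mean = statistics.mean(values)
    # stdev divides by n - 1; pstdev divides by n
    std_dev = statistics.stdev(values) if sample else statistics.pstdev(values)
    if std_dev == 0:
        raise ValueError("all values are identical, so the z-score is undefined")
    return [(x - mean) / std_dev for x in values]

ages = [21, 35, 65, 28, 56]
print([round(z, 2) for z in z_scores(ages)])
# [-1.07, -0.32, 1.28, -0.69, 0.8]
```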

## 6. What is the output?
print(2 ** 3 ** 2)
a) 512
b) 64
c) 16
d) Error

In [11]:
print(2 ** 3 ** 2)
# I guessed 64 at first, but ** is right-associative in Python: 2 ** 3 ** 2 is 2 ** (3 ** 2) = 2 ** 9 = 512, so the answer is (a).

512
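
Adding explicit parentheses shows both possible groupings side by side. This is only an illustration; the variable names are arbitrary.

```python
right_to_left = 2 ** (3 ** 2)   # how Python actually groups 2 ** 3 ** 2
left_to_right = (2 ** 3) ** 2   # the grouping behind the tempting answer

print(right_to_left, left_to_right)      # 512 64
print(2 ** 3 ** 2 == right_to_left)      # True
```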


## 7. Output of this code?
x = [[], False, 0, '', 1]
print(list(filter(bool, x)))
a) [[], False, 0, '', 1]
b) [1]
c) [[], 1]
d) []

In [12]:
x = [[], False, 0, '', 1]

In [13]:
list(filter(bool, x))

[1]
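
The answer is (b) because filter(bool, x) keeps only the items whose truth value is True, and empty containers, False, 0, and the empty string are all falsy. The sketch below checks each item individually; the variable names are arbitrary.

```python
x = [[], False, 0, '', 1]

# Truth value of each element, in order
truth_values = [bool(item) for item in x]
print(truth_values)  # [False, False, False, False, True]

# Two equivalent spellings of the same filter
truthy = [item for item in x if item]
print(truthy)                   # [1]
print(list(filter(None, x)))    # [1]
```

Passing None as the first argument to filter has the same effect as passing bool.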

## 8. Predict the output:
print([[x*y for x in range(1, 4)] for y in range(1, 3)])
a) [[1, 2, 3], [1, 2, 3]]
b) [[1, 2, 3], [2, 4, 6]]
c) [[1, 2, 3], [3, 6, 9]]
d) Error

In [14]:
print([[x*y for x in range(1, 4)] for y in range(1, 3)]) 

[[1, 2, 3], [2, 4, 6]]
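
The answer is (b). Unrolling the nested comprehension into explicit loops makes the evaluation order visible: the outer for y produces one row per value of y, and the inner for x fills that row. The names rows and row are used only for this illustration.

```python
rows = []
for y in range(1, 3):          # outer comprehension: y = 1, 2
    row = []
    for x in range(1, 4):      # inner comprehension: x = 1, 2, 3
        row.append(x * y)
    rows.append(row)

print(rows)  # [[1, 2, 3], [2, 4, 6]]
```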


## 9. Find the output:
for i in range(3):
    if i == 1:
        continue
    print(i)
else:
    print("Done")
a) 0 2 Done
b) 0 1 2 Done
c) 0 2
d) 0 2 else

In [15]:
for i in range(3):
    if i == 1:
        continue
    print(i)
else:
    print("Done")

0
2
Done


In [16]:
# The else block of a for loop runs only when the loop finishes without hitting a break.
# continue just skips the rest of the current iteration, so the loop still completes normally and "Done" is printed. The answer is (a).
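
For contrast, here is a small sketch in which a break does skip the else block. The function first_negative is hypothetical and exists only to show both paths.

```python
def first_negative(values):
    """Describe the first negative value, or report that there is none."""
    for v in values:
        if v < 0:
            result = f"found {v}"
            break              # leaving via break skips the else block
    else:
        result = "no negatives"  # runs only if the loop was never broken
    return result

print(first_negative([3, -1, 4]))  # found -1
print(first_negative([3, 1, 4]))   # no negatives
```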

## 10. Find the output:
a = "Python"
print(a[::-1])
a) Python
b) nohtyP
c) Error
d) nthyoP

In [17]:
a = "Python" 
print(a[::-1])

nohtyP
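
The answer is (b). A slice has the form a[start:stop:step], and a step of -1 with start and stop omitted walks the string from the last character to the first. A few related slices, shown purely as illustration:

```python
a = "Python"

print(a[::-1])               # nohtyP  (every character, backwards)
print(a[::2])                # Pto     (every second character)
print(a[1:4])                # yth     (indices 1, 2, 3)
print("".join(reversed(a)))  # nohtyP  (equivalent to a[::-1])
```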


## 11. Extra: Add a Segment Score column
data = {'Customer': ['A', 'B', 'C', 'D'],
        'Age': [25, 45, 35, 23],
        'Annual Income': [50000, 100000, 75000, 48000]}
df = pd.DataFrame(data)

Write code to:
Normalize 'Annual Income'
Bin Age into 3 categories (Youth, Adult, Senior)
Add a Segment Score column as: AgeBin_Label + Income_Normalized_Value

In [18]:
data = {'Customer': ['A', 'B', 'C', 'D'],
        'Age': [25, 45, 35, 23],
        'Annual Income': [50000, 100000, 75000, 48000]}
df = pd.DataFrame(data)


In [19]:
df['Income_Normalized_Value'] = (df['Annual Income'] - df['Annual Income'].mean()) / df['Annual Income'].std()

In [20]:
df['AgeBin_Label'] = pd.cut(df['Age'], bins=[0, 25, 45, 60], labels=['Youth', 'Adult', 'Senior'], right=True)

In [21]:
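# Cast both parts to str before concatenating; a negative z-score supplies the '-' seen in values like 'Youth-0.75'
df['Score'] = df['AgeBin_Label'].astype(str) + df['Income_Normalized_Value'].round(2).astype(str)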

In [22]:
df['Score']

0    Youth-0.75
1      Adult1.3
2     Adult0.28
3    Youth-0.83
Name: Score, dtype: object
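
In the output above, the minus sign of a negative score can be mistaken for a separator, and 1.3 loses its trailing zero. One possible alternative format, sketched in plain Python, adds an explicit separator and a fixed two-decimal signed value. The function segment_score and the "_" separator are assumptions, and the input pairs are the rounded values from the output above.

```python
def segment_score(label, z_value, sep="_"):
    """Join a bin label and a z-score with an explicit sign and two decimals."""
    return f"{label}{sep}{z_value:+.2f}"

rows = [("Youth", -0.75), ("Adult", 1.3), ("Adult", 0.28), ("Youth", -0.83)]
scores = [segment_score(label, z) for label, z in rows]
print(scores)
# ['Youth_-0.75', 'Adult_+1.30', 'Adult_+0.28', 'Youth_-0.83']
```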