# Programming Interview Questions And Answers For Data Science

### 1. How do you get a list of all the keys in a dictionary?
- To get all the keys in a dictionary, call its .keys() method. It returns a dict_keys view, which list() converts to a list. The .values() method does the same for the values.
- Consider the dictionary below, which maps country names (keys) to their capitals (values).

#### Questions
- Get the keys as a list and the values as a list from a given dictionary.

In [82]:
dic = {'USA':'Washington D.C.', 'India':'New Delhi', 'France':'Paris', 'Spain':'Madrid', 'Germany':'Berlin'}

keys = dic.keys()
print(keys)
print(type(keys))

keys_list = list(keys)
print(keys_list)
print(type(keys_list))
print(keys_list[0])

values = dic.values()
print(values)
print(type(values))

values_list = list(values)
print(values_list)
print(type(values_list))
print(values_list[0])

dict_keys(['USA', 'India', 'France', 'Spain', 'Germany'])
<class 'dict_keys'>
['USA', 'India', 'France', 'Spain', 'Germany']
<class 'list'>
USA
dict_values(['Washington D.C.', 'New Delhi', 'Paris', 'Madrid', 'Berlin'])
<class 'dict_values'>
['Washington D.C.', 'New Delhi', 'Paris', 'Madrid', 'Berlin']
<class 'list'>
Washington D.C.
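
A related detail that often comes up: the objects returned by .keys() and .values() are live views of the dictionary, while list() takes a snapshot. The sketch below illustrates this with a smaller, hypothetical capitals dictionary (not the one from the cell above) and relies on dictionaries preserving insertion order (Python 3.7+).

```python
# Hypothetical example data, smaller than the dictionary used above.
capitals = {'USA': 'Washington D.C.', 'India': 'New Delhi'}

keys_view = capitals.keys()      # live view of the keys
keys_snapshot = list(keys_view)  # independent copy taken right now

capitals['France'] = 'Paris'     # mutate the dictionary afterwards

print(list(keys_view))   # the view reflects the new key
print(keys_snapshot)     # the snapshot does not

# .items() yields (key, value) pairs, which is convenient for looping.
for country, capital in capitals.items():
    print(country, '->', capital)
```

If you only need to iterate, the view is enough; convert to a list when you need indexing or a stable copy.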


### 2. What do you mean by List Comprehension in Python? 

- List comprehensions are a concise way to build a new list from an existing iterable, optionally transforming and filtering its elements.

#### Questions
- Get the squares of the numbers in a list.
- Get the even numbers in a list.

In [14]:
a = [1,2,3,4]
squares_of_list = [item*item for item in a]
print(squares_of_list)

even_numbers = [item for item in a if item%2 == 0]
print(even_numbers)

[1, 4, 9, 16]
[2, 4]
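
To make the syntax concrete, the sketch below writes one comprehension next to its plain-loop equivalent, and shows that the same pattern with braces builds a dictionary. The variable names are illustrative only.

```python
numbers = [1, 2, 3, 4]

# Plain loop: square only the even numbers.
even_squares_loop = []
for n in numbers:
    if n % 2 == 0:
        even_squares_loop.append(n * n)

# Equivalent list comprehension: expression, loop, optional filter.
even_squares = [n * n for n in numbers if n % 2 == 0]
print(even_squares_loop == even_squares)

# The same syntax with braces and key: value builds a dictionary.
square_of = {n: n * n for n in numbers}
print(square_of)
```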


### 3. Given a list of numbers, find the squares of the numbers using the map function in Python.

- The map() function applies the specified function to each element of an iterable. In Python 3 it returns a lazy map object (an iterator), so wrap it in list() to get a list of results.

In [15]:
l = [1,2,3,4]
squares = lambda x: x*x

mapped_list = list(map(squares, l))
print(mapped_list)

[1, 4, 9, 16]
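
Because map() in Python 3 produces an iterator, its results can be consumed only once. The short sketch below shows that behavior, plus map() with two input iterables; the names are illustrative.

```python
numbers = [1, 2, 3, 4]

squares_iter = map(lambda x: x * x, numbers)
first_pass = list(squares_iter)   # consumes the iterator
second_pass = list(squares_iter)  # nothing is left the second time

print(first_pass)
print(second_pass)

# map() also accepts several iterables and passes one element from each.
sums = list(map(lambda a, b: a + b, [1, 2, 3], [10, 20, 30]))
print(sums)
```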


### 4. Write a program that uses a lambda function to select the numbers divisible by 3 from a tuple and store them in another tuple.

- Lambda functions are anonymous functions defined with the lambda keyword instead of def. Their body is limited to a single expression.

In [17]:
t = (1,2,3,4,5,6,7,8,9,10)
divisible_by_3 = lambda x: x%3 == 0

filtered_tuple = tuple(filter(divisible_by_3, t))
print(filtered_tuple)

(3, 6, 9)
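
As a point of comparison, the same selection can be written with a generator expression instead of filter() and a lambda. The sketch below checks that both give the same tuple, and also shows the complementary case (numbers not divisible by 3).

```python
t = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

# filter() keeps the elements for which the lambda returns True.
with_filter = tuple(filter(lambda x: x % 3 == 0, t))

# A generator expression passed to tuple() does the same job.
with_genexp = tuple(x for x in t if x % 3 == 0)

# Inverting the condition keeps everything else.
not_divisible = tuple(filter(lambda x: x % 3 != 0, t))

print(with_filter, with_genexp, not_divisible)
```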


### 5. How to create a dataframe from a dictionary?
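- To create a dataframe from a dictionary, pass the dictionary to the pd.DataFrame() constructor from the Pandas library. With a dictionary of lists, each key becomes a column name and each list becomes that column's values.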

In [28]:
import pandas as pd

dic = {'Name': ['Ammy', 'Angela', 'Samuel', 'Danny'],
       'Age': [35, 36, 30, 32],
       'City': ['NYC', 'Chicago', 'Boston', 'Seattle']    
      }

df = pd.DataFrame(dic)

df.head()

|   | Name | Age | City |
|---|------|-----|------|
| 0 | Ammy | 35 | NYC |
| 1 | Angela | 36 | Chicago |
| 2 | Samuel | 30 | Boston |
| 3 | Danny | 32 | Seattle |


In [29]:
type(df)

pandas.core.frame.DataFrame
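
Data often arrives as a list of row dictionaries rather than a dictionary of columns. As a minimal sketch (with hypothetical records), pd.DataFrame() accepts that shape too, and its columns argument selects and orders the columns.

```python
import pandas as pd

# Hypothetical row-oriented data: one dictionary per row.
records = [
    {'Name': 'Ammy', 'Age': 35, 'City': 'NYC'},
    {'Name': 'Angela', 'Age': 36, 'City': 'Chicago'},
]

df_records = pd.DataFrame(records)
print(df_records)

# columns= keeps only the listed columns, in the given order.
df_subset = pd.DataFrame(records, columns=['City', 'Name'])
print(df_subset)
```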

### 6. What is the difference between loc[] and iloc[] in Pandas?

- **loc[]** is label-based. It uses row and column labels (or a boolean mask) to select and slice data from a dataframe. A label slice includes its end label.
- **iloc[]** is integer-location-based. It uses 0-based integer positions to select rows and columns. A position slice excludes its end position, as in regular Python slicing.
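
The difference is easiest to see when the index labels are not 0, 1, 2, and so on. The sketch below uses a small hypothetical dataframe whose index labels are 10, 20, 30, 40.

```python
import pandas as pd

# Hypothetical data: index labels deliberately differ from positions.
scores = pd.DataFrame(
    {'score': [90, 85, 70, 60]},
    index=[10, 20, 30, 40],
)

by_label = scores.loc[10:30]    # labels 10, 20, 30: end label included
by_position = scores.iloc[0:2]  # positions 0 and 1: end position excluded

print(by_label)
print(by_position)

# The same cell addressed by label and by position.
print(scores.loc[20, 'score'], scores.iloc[1, 0])
```

With the default RangeIndex the labels and positions coincide, which is why car.loc[2:5] and car.iloc[2:6] return the same rows.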

#### Questions

1. Create a dictionary about cars and convert it to a dataframe.
1. Use loc[] and iloc[] to select the following partitions:
    1. rows 2 to 5 for the Brand and City columns.
    1. rows 2 to 4 for the Year, Kms, and City columns.
    1. rows that have Mileage < 25 for the Brand and City columns.

In [45]:
import pandas as pd

dic = {'Brand': ['Ford', 'Hyundai', 'Tata', 'Mahindra', 'Maruti', 'Hyundai', 'Renault', 'Tata', 'kia'],
       'Year': [2016, 2012, 2011, 2010, 2009, 2010, 2012, 2017, 2019],
       'Kms': [23000, 40000, 38000, 45000, 50000, 46000, 18000, 9000, 5000],
       'City': ['Kolkata', 'Mumbai', 'Delhi', 'Bangalore', 'Mumbai', 'Bangalore', 'Chennai', 'Bangalore', 'Chennai'],
       'Mileage': [25, 27, 25, 26, 28, 29, 24, 21, 19]
      }

car = pd.DataFrame(dic)
number_of_rows = car.shape[0]
car.head(number_of_rows)

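|   | Brand | Year | Kms | City | Mileage |
|---|-------|------|-----|------|---------|
| 0 | Ford | 2016 | 23000 | Kolkata | 25 |
| 1 | Hyundai | 2012 | 40000 | Mumbai | 27 |
| 2 | Tata | 2011 | 38000 | Delhi | 25 |
| 3 | Mahindra | 2010 | 45000 | Bangalore | 26 |
| 4 | Maruti | 2009 | 50000 | Mumbai | 28 |
| 5 | Hyundai | 2010 | 46000 | Bangalore | 29 |
| 6 | Renault | 2012 | 18000 | Chennai | 24 |
| 7 | Tata | 2017 | 9000 | Bangalore | 21 |
| 8 | kia | 2019 | 5000 | Chennai | 19 |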


In [46]:
car.loc[2:5, ['Brand', 'City']]

|   | Brand | City |
|---|-------|------|
| 2 | Tata | Delhi |
| 3 | Mahindra | Bangalore |
| 4 | Maruti | Mumbai |
| 5 | Hyundai | Bangalore |


In [53]:
car.iloc[2:6, [0,3]]

|   | Brand | City |
|---|-------|------|
| 2 | Tata | Delhi |
| 3 | Mahindra | Bangalore |
| 4 | Maruti | Mumbai |
| 5 | Hyundai | Bangalore |


In [52]:
car.loc[2:4, ['Year', 'Kms', 'City']]

|   | Year | Kms | City |
|---|------|-----|------|
| 2 | 2011 | 38000 | Delhi |
| 3 | 2010 | 45000 | Bangalore |
| 4 | 2009 | 50000 | Mumbai |


In [50]:
car.iloc[2:5, 1:4]

|   | Year | Kms | City |
|---|------|-----|------|
| 2 | 2011 | 38000 | Delhi |
| 3 | 2010 | 45000 | Bangalore |
| 4 | 2009 | 50000 | Mumbai |


In [54]:
car.loc[car['Mileage']<25, ['Brand', 'City']]

|   | Brand | City |
|---|-------|------|
| 6 | Renault | Chennai |
| 7 | Tata | Bangalore |
| 8 | kia | Chennai |


In [59]:
car.iloc[(car['Mileage']<25).values, [0,3]]

|   | Brand | City |
|---|-------|------|
| 6 | Renault | Chennai |
| 7 | Tata | Bangalore |
| 8 | kia | Chennai |


### 7. How is list.append() different from list.extend()? 

- **list.append()** 
    - adds its argument as a single element to the end of a list.
    - The length of the list increases by one.

- **list.extend()**
    - iterates over its argument and adds each element to the list.
    - The length of the list increases by the number of elements in its argument.

In [65]:
l = [1,2,3]
print(l)
print(len(l))

l.append(['Hello', 3.5, True])
print(l)
print(len(l))

[1, 2, 3]
3
[1, 2, 3, ['Hello', 3.5, True]]
4


In [66]:
l = [1,2,3]
print(l)
print(len(l))

l.extend(['Hello', 3.5, True])
print(l)
print(len(l))

[1, 2, 3]
3
[1, 2, 3, 'Hello', 3.5, True]
6
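
Two further points are worth a quick sketch: extend() iterates over any iterable, including a string, and both methods modify the list in place and return None. The variable names below are illustrative.

```python
letters = ['a']
letters.extend('bc')    # iterates over the string: adds 'b' and 'c'
print(letters)

nested = ['a']
nested.append('bc')     # adds the whole string as one element
print(nested)

combined = [1, 2] + [3]  # + builds a new list instead of mutating
print(combined)

result = [1].append(2)   # both methods return None
print(result)
```

A common bug is writing l = l.append(x), which replaces the list with None.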


### 8. Suppose you have a car dataset with a Mileage column. Create a new field called mlg that takes one of two values: "Low mlg" if Mileage < 25, and "High mlg" if Mileage >= 25.

In [67]:
import pandas as pd

dic = {'Brand': ['Ford', 'Hyundai', 'Tata', 'Mahindra', 'Maruti', 'Hyundai', 'Renault', 'Tata', 'kia'],
       'Year': [2016, 2012, 2011, 2010, 2009, 2010, 2012, 2017, 2019],
       'Kms': [23000, 40000, 38000, 45000, 50000, 46000, 18000, 9000, 5000],
       'City': ['Kolkata', 'Mumbai', 'Delhi', 'Bangalore', 'Mumbai', 'Bangalore', 'Chennai', 'Bangalore', 'Chennai'],
       'Mileage': [25, 27, 25, 26, 28, 29, 24, 21, 19]
      }

car = pd.DataFrame(dic)
number_of_rows = car.shape[0]

f = lambda x: "Low mlg" if x<25 else "High mlg"
car['mlg'] = car.Mileage.apply(f)

car.head(number_of_rows)

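|   | Brand | Year | Kms | City | Mileage | mlg |
|---|-------|------|-----|------|---------|-----|
| 0 | Ford | 2016 | 23000 | Kolkata | 25 | High mlg |
| 1 | Hyundai | 2012 | 40000 | Mumbai | 27 | High mlg |
| 2 | Tata | 2011 | 38000 | Delhi | 25 | High mlg |
| 3 | Mahindra | 2010 | 45000 | Bangalore | 26 | High mlg |
| 4 | Maruti | 2009 | 50000 | Mumbai | 28 | High mlg |
| 5 | Hyundai | 2010 | 46000 | Bangalore | 29 | High mlg |
| 6 | Renault | 2012 | 18000 | Chennai | 24 | Low mlg |
| 7 | Tata | 2017 | 9000 | Bangalore | 21 | Low mlg |
| 8 | kia | 2019 | 5000 | Chennai | 19 | Low mlg |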
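
An alternative to apply() with a lambda is NumPy's np.where(), which evaluates the condition on the whole column at once. The sketch below uses a reduced, hypothetical car_demo dataframe rather than the full car dataframe above.

```python
import numpy as np
import pandas as pd

# Hypothetical reduced dataset with only the column that matters here.
car_demo = pd.DataFrame({'Mileage': [25, 27, 24, 19]})

# np.where(condition, value_if_true, value_if_false), applied element-wise.
car_demo['mlg'] = np.where(car_demo['Mileage'] < 25, 'Low mlg', 'High mlg')

print(car_demo)
```

For a two-way split either approach works; the vectorized form usually scales better on large columns.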


### 9. Create a 4x4 matrix and find the sum of all the diagonal elements of the matrix.
- To create a matrix in Python, you can use a NumPy array, for example np.random.randn(4, 4) for random values or np.arange(16).reshape(4, 4) for consecutive integers.
- Use the np.diagonal() function to get the diagonal elements, then sum them.

In [68]:
import numpy as np

x = np.random.randn(4,4)
print(x)

[[ 1.24344855 -0.3094567   0.72503916 -0.27084368]
 [ 0.25415375  0.98350294  0.85272108  0.01923482]
 [ 1.66807055 -0.4086649   0.22909628 -1.30198681]
 [ 0.21033903 -1.23647358 -1.11526176  0.35153177]]


In [71]:
d = np.diagonal(x)
print(d)
print(np.sum(d))

[1.24344855 0.98350294 0.22909628 0.35153177]
2.807579537270831


In [72]:
length = x.shape[0]
summation = 0

for i in range(length):
    summation += x[i,i]

print(summation)

2.807579537270831
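
NumPy also provides np.trace(), which returns the sum of the main diagonal directly. The sketch below uses a deterministic matrix built with arange().reshape() so the result is easy to check by hand (0 + 5 + 10 + 15), and adds the anti-diagonal as a common follow-up.

```python
import numpy as np

# Deterministic 4x4 matrix holding 0..15, row by row.
m = np.arange(16).reshape(4, 4)
print(m)

diag_sum = np.diagonal(m).sum()  # sum of the main diagonal: 0, 5, 10, 15
trace = np.trace(m)              # same quantity in a single call

# Anti-diagonal: flip the columns, then take the main diagonal (3, 6, 9, 12).
anti_diag_sum = np.fliplr(m).diagonal().sum()

print(diag_sum, trace, anti_diag_sum)
```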


### 10. Given a car dataframe, find the average and maximum mileage of the cars using Pandas.

In [73]:
import pandas as pd

dic = {'Brand': ['Ford', 'Hyundai', 'Tata', 'Mahindra', 'Maruti', 'Hyundai', 'Renault', 'Tata', 'kia'],
       'Year': [2016, 2012, 2011, 2010, 2009, 2010, 2012, 2017, 2019],
       'Kms': [23000, 40000, 38000, 45000, 50000, 46000, 18000, 9000, 5000],
       'City': ['Kolkata', 'Mumbai', 'Delhi', 'Bangalore', 'Mumbai', 'Bangalore', 'Chennai', 'Bangalore', 'Chennai'],
       'Mileage': [25, 27, 25, 26, 28, 29, 24, 21, 19]
      }

car = pd.DataFrame(dic)
number_of_rows = car.shape[0]

car.head(number_of_rows)

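|   | Brand | Year | Kms | City | Mileage |
|---|-------|------|-----|------|---------|
| 0 | Ford | 2016 | 23000 | Kolkata | 25 |
| 1 | Hyundai | 2012 | 40000 | Mumbai | 27 |
| 2 | Tata | 2011 | 38000 | Delhi | 25 |
| 3 | Mahindra | 2010 | 45000 | Bangalore | 26 |
| 4 | Maruti | 2009 | 50000 | Mumbai | 28 |
| 5 | Hyundai | 2010 | 46000 | Bangalore | 29 |
| 6 | Renault | 2012 | 18000 | Chennai | 24 |
| 7 | Tata | 2017 | 9000 | Bangalore | 21 |
| 8 | kia | 2019 | 5000 | Chennai | 19 |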


In [76]:
car.Mileage.max()

29

In [78]:
car.Mileage.mean()

24.88888888888889
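
Both statistics can also be computed in one call with Series.agg(), and idxmax() locates the row that holds the maximum. The sketch below uses a small hypothetical car_demo dataframe so the numbers are easy to verify.

```python
import pandas as pd

# Hypothetical three-row dataset.
car_demo = pd.DataFrame({
    'Brand': ['Ford', 'Hyundai', 'Tata'],
    'Mileage': [25, 29, 21],
})

# agg() with a list of function names returns a Series indexed by those names.
summary = car_demo['Mileage'].agg(['mean', 'max'])
print(summary)

# idxmax() gives the index label of the maximum; loc fetches that row.
best_row = car_demo.loc[car_demo['Mileage'].idxmax()]
print(best_row['Brand'])
```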

### 11. Given a car dataframe, find the average and maximum mileage for each brand of car using Pandas.

In [80]:
result = car.groupby('Brand').agg({'Mileage':['mean', 'max']})
length = result.shape[0]
result.head(length)

| Brand | Mileage (mean) | Mileage (max) |
|-------|----------------|---------------|
| Ford | 25 | 25 |
| Hyundai | 28 | 29 |
| Mahindra | 26 | 26 |
| Maruti | 28 | 28 |
| Renault | 24 | 24 |
| Tata | 23 | 25 |
| kia | 19 | 19 |
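
Passing a dictionary to agg() produces two-level column headers. If flat column names are preferred, named aggregation (available since pandas 0.25) is one option. The sketch below uses a hypothetical four-row car_demo dataframe, and the output column names mean_mileage and max_mileage are made up for illustration.

```python
import pandas as pd

# Hypothetical dataset with two brands, two cars each.
car_demo = pd.DataFrame({
    'Brand': ['Tata', 'Hyundai', 'Tata', 'Hyundai'],
    'Mileage': [25, 27, 21, 29],
})

# Named aggregation: new_column=(source_column, function).
per_brand = car_demo.groupby('Brand').agg(
    mean_mileage=('Mileage', 'mean'),
    max_mileage=('Mileage', 'max'),
)

print(per_brand)
```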
