# 51 Python Interview Questions for Data Scientists

## 1. What is the difference between a list and a tuple?

This question comes up in almost every Python / data science interview, so make sure you know the answer.

- **Lists are mutable.** They can be changed after creation.

- **Tuples are immutable.** Once a tuple is created, it cannot be modified.

- **Lists are usually homogeneous.** Both lists and tuples keep their elements in order, but a list typically holds a sequence of objects of the same kind.  
  Example: all user names ordered by creation date  
  `["Seth", "Ema", "Eli"]`

- **Tuples have structure.** Different data types can exist at each index.  
  Example: a record from a database  
  `(2, "Ema", "2020-04-16")  # id, name, created_at`
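
The mutability difference is easy to check directly. The following is a minimal sketch that reuses the example values above; the variable names are only illustrative.

```python
users = ["Seth", "Ema", "Eli"]      # list: mutable
record = (2, "Ema", "2020-04-16")   # tuple: immutable

users.append("Ann")                 # works, the list grows in place
print(users)                        # ['Seth', 'Ema', 'Eli', 'Ann']

try:
    record[1] = "Emma"              # tuples do not support item assignment
except TypeError as exc:
    error_message = str(exc)
    print(f"TypeError: {exc}")

# A tuple of hashable items is itself hashable, so it can be a dict key
rows = {record: "row 2"}
print(rows[(2, "Ema", "2020-04-16")])   # row 2
```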


## 2. How is string interpolation performed?

**Interpolation** is the "insertion" of variable values (or complex expressions) directly into a text string. In Python, even without additional imports, there are three main ways to do this:

| Method         | Example                       | When to use                            |
|----------------|-------------------------------|----------------------------------------|
| **f-strings**  | `f"Hello {name}"`             | Python 3.6+, fastest and most readable |
| **`%` operator** | `"Hi %s" % name`             | Older/legacy code, compatibility       |
| **`.format()`** | `"My name is {}".format(name)` | Before Python 3.6 or if `.format()` is already widely used |

An alternative is to use the `Template` class from the `string` module.


```python
name = 'George'

text1 = f'My name is {name}'
print(text1)
#=> My name is George

text2 = 'My name is {name}'.format(name=name)
print(text2)
#=> My name is George

text3 = 'My name is %s' % name
print(text3)
#=> My name is George

number = 3.145634
print(f'The number is {number:,.4f}')
#=> The number is 3.1456
```
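
The `Template` class mentioned above works a little differently from the other three methods, so here is a small sketch of it. The placeholder names are only illustrative.

```python
from string import Template

# Placeholders use $name (or ${name}); values are passed to substitute()
greeting = Template("My name is $name")
text = greeting.substitute(name="George")
print(text)  # My name is George

# safe_substitute() leaves unknown placeholders in place instead of raising KeyError
partial = Template("$greeting, $name").safe_substitute(name="George")
print(partial)  # $greeting, George
```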


## 3. What is the difference between “is” and “==”?

- **`==`** checks for **equality of values**.
- **`is`** checks for **identity** (both names refer to the same object in memory).

Example:

```python
a = [1, 2, 3]  
b = a  
c = [1, 2, 3]  

print(a == b)  # True  
print(a == c)  # True  

print(a is b)  # True  
print(a is c)  # False  

print(id(a))  
print(id(b))  # same as a
print(id(c))  # different
```

## 4. What is a decorator?

A decorator is a function that takes an existing function and returns a new function that wraps it, so you can run additional code before or after the original call without changing the original function. It's kind of like an add-on to an existing function.

```python
def simple_wrapper(func):
    def inner():
        print("→ Before the call")
        func()
        print("← After the call")
    return inner

@simple_wrapper
def greet():
    print("Hello!")

greet()

#=> → Before the call
#=> Hello!
#=> ← After the call
```
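
Real decorators usually need to accept arbitrary arguments and preserve the wrapped function's name. The sketch below shows one common way to do that with `functools.wraps`; the `log_calls` decorator and its `calls` attribute are hypothetical names chosen for illustration.

```python
import functools

def log_calls(func):
    """Hypothetical decorator that records each call of the wrapped function."""
    @functools.wraps(func)            # keep func.__name__ and func.__doc__
    def wrapper(*args, **kwargs):     # accept any arguments
        wrapper.calls.append((args, kwargs))
        return func(*args, **kwargs)  # pass the result through
    wrapper.calls = []
    return wrapper

@log_calls
def add(x, y):
    """Return the sum of x and y."""
    return x + y

result = add(2, y=3)
print(result)        # 5
print(add.__name__)  # add
print(add.calls)     # [((2,), {'y': 3})]
```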

## 5. Explain how the range function works

`range` produces a sequence of integers. In Python 3 it returns a lazy `range` object rather than a list, so wrap it in `list()` to see the values.

The function accepts from 1 to 3 arguments. Let us consider the three cases.

1. **range(stop)**: generates integers from 0 up to, but not including, “stop”.
```python
list(range(10)) # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```
2. **range(start, stop)**: generates integers from “start” up to, but not including, “stop”.
```python
list(range(2,10)) # [2, 3, 4, 5, 6, 7, 8, 9]
```
3. **range(start, stop, step)**: generates integers from “start” up to “stop” in increments of “step”.
```python
list(range(2,10,2)) # [2, 4, 6, 8]
```
In all cases, stop (the upper bound) is not included in the result.
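
A few more cases are worth knowing: a negative step, an empty range, and the fact that a `range` object does not store its values. This is a short sketch with illustrative variable names.

```python
countdown = list(range(10, 0, -3))   # negative step counts down
print(countdown)                     # [10, 7, 4, 1]

empty = list(range(5, 2))            # start >= stop with a positive step
print(empty)                         # []

big = range(10**12)                  # no list of a trillion items is built
print(len(big))                      # 1000000000000
print(999 in big)                    # True, computed arithmetically
print(big[-1])                       # 999999999999
```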

## 6. Define a class named Car with 2 properties, “color” and “speed”. Then create an instance and return the speed property

```python
class Car:
    def __init__(self, color, speed):
        self.color = color
        self.speed = speed

my_car = Car('Black', 150)
print(my_car.speed)
#=> 150
```


## 7. What is the difference between class method, instance method and static method in Python?

Instance methods: take the `self` parameter and work with a specific instance of the class.

Static methods: use the `@staticmethod` decorator. They receive neither `self` nor `cls`, so they are self-contained and are not tied to a specific instance.

Class methods: use the `@classmethod` decorator, accept the `cls` parameter, and can modify the class itself.

Let's illustrate the difference with a fictional `CoffeeShop` class.

```python

class CoffeeShop:
    specialty = 'espresso'

    def __init__(self, coffee_price):
        self.coffee_price = coffee_price

    # instance method
    def make_coffee(self):
        print(f'Making {self.specialty} for ${self.coffee_price}')

    # static method
    @staticmethod
    def check_weather():
        print('Its sunny')

    # class method
    @classmethod
    def change_specialty(cls, specialty):
        cls.specialty = specialty
        print(f'Specialty changed to {specialty}')
```

The `CoffeeShop` class has a `specialty` attribute set to the default value 'espresso'. Each `CoffeeShop` instance is initialized with the `coffee_price` attribute. The class also has 3 methods: an instance method, a static method and a class method.

Let's initialize a coffee shop instance with coffee_price = 5. Then let's call the make_coffee instance method.

```python
coffee_shop = CoffeeShop(5)
coffee_shop.make_coffee()
#=> Making espresso for $5
```

Now let's call a static method. Static methods do not receive the instance or the class, so they are usually used for utility functions. We'll use our static method to check the weather.

```python
coffee_shop.check_weather()
#=> Its sunny
```

Now let's use the class method to change the shop's specialty, and then call the `make_coffee` method again.

```python
coffee_shop.change_specialty('drip coffee')
#=> Specialty changed to drip coffee

coffee_shop.make_coffee()
#=> Making drip coffee for $5
```

Notice how the make_coffee method used to make espresso, but now makes drip coffee!

## 8. What is the difference between “func” and “func()”?

The purpose of this question is to see if you know that all functions are also objects in Python.

```python
def func():
    print('Im a function')
    
func
#=> <function __main__.func()>

func()    
#=> Im a function
```

`func` is an object representing the function; it can be assigned to a variable or passed to another function. `func()` with parentheses calls the function and evaluates to whatever it returns (here `None`, since there is no `return` statement).
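
Because functions are objects, they can be given a second name or passed as arguments. The sketch below uses two hypothetical helpers, `shout` and `apply_twice`, to show the difference between referring to a function and calling it.

```python
def shout(text):
    return text.upper() + '!'

def apply_twice(fn, value):      # fn is a function object, not a call
    return fn(fn(value))

alias = shout                    # no parentheses: just another name for the function
print(alias is shout)            # True
print(alias('hi'))               # HI!
print(apply_twice(shout, 'hi'))  # HI!!
print(callable(shout), callable(shout('hi')))  # True False
```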

## 9. Explain how the map() function works

The `map()` function in Python applies a given function to each item in an iterable (like a list, tuple, etc.) and returns an iterator of the results. It’s a concise way to transform data without using explicit loops.

```python
map(function, iterable, ...) # syntax
```

Example:
```python
# Double each number in a list
numbers = [1, 2, 3, 4]
result = map(lambda x: x * 2, numbers)
print(list(result))  # Output: [2, 4, 6, 8]
```

## 10. Explain how the reduce() function works

This function can be hard to grasp until you have used it a few times.

`reduce` takes a function of two arguments and a sequence, and iterates through that sequence. At each step, the result of the previous call (the accumulator) and the current element are passed to the function. At the end, a single value is returned.

```python
from functools import reduce

def add(x, y):
    return x + y

li = [1,2,3,5]

reduce(add, li)
#=> 11
```
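
To make the accumulation explicit, the sketch below writes the same reduction as a plain loop and then shows the optional third argument of `reduce`, the initial value. The variable names are illustrative.

```python
from functools import reduce

li = [1, 2, 3, 5]

# What reduce does with an addition function, written as an explicit loop
acc = li[0]
for item in li[1:]:
    acc = acc + item          # ((1 + 2) + 3) + 5
print(acc)                    # 11

# The optional third argument is the starting value of the accumulator
total = reduce(lambda x, y: x + y, li, 100)
print(total)                  # 111

# With an initial value, an empty sequence is fine
empty_total = reduce(lambda x, y: x + y, [], 0)
print(empty_total)            # 0
```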

## 11. Explain how the filter() function works

The `filter()` function does what its name says: it filters the elements of an iterable.

`filter(f, iterable)` passes each element of the iterable to the function `f` and keeps only those for which `f(element)` is true. It returns an iterator, so wrap it in `list()` to get a list.

```python
def is_even(x):
    return x % 2 == 0

li = [1,2,3,4,5,6,7,8]
new_listdata = list(filter(is_even, li))
print(new_listdata)
#=> [2, 4, 6, 8]
```
Note that all elements not divisible by 2 have been removed.

## 12. Does Python use call by reference or call by value?

Python always passes a **reference to the object**, not a copy and not the variable itself.

- If the object is **immutable** (e.g., `int`, `str`, `tuple`), any change inside the function creates a **new object**.  
  The original object remains unchanged outside the function.
- If the object is **mutable** (e.g., `list`, `dict`, `set`), and it’s modified **in place** inside the function,  
  the changes are **reflected outside** the function.

Immutable Example (string):
```python
name = 'chr'
def add_chars(s):
    s += 'is'
    print(s)
    
add_chars(name)    
print(name)
#=> chris
#=> chr
```
Mutable Example (list):
```python
li = [1,2]
def add_element(seq):
    seq.append(3)
    print(seq)
    
add_element(li)    
print(li)
#=> [1, 2, 3]
#=> [1, 2, 3]
```
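
The key distinction is between rebinding a local name and mutating the object it refers to. Here is a minimal sketch with two hypothetical functions, `rebind` and `mutate`, that makes the difference visible.

```python
def rebind(seq):
    seq = seq + [3]      # builds a new list and rebinds the local name only
    return seq

def mutate(seq):
    seq += [3]           # for lists, += extends the same object in place
    return seq

original = [1, 2]
rebound = rebind(original)
print(original, rebound, rebound is original)   # [1, 2] [1, 2, 3] False

mutated = mutate(original)
print(original, mutated is original)            # [1, 2, 3] True
```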

## 13. How do you reverse a list?

To reverse a list in place, call its `reverse()` method. The method modifies the list itself and returns `None`.

```python
li = ['a','b','c']
print(li)
li.reverse()
print(li)
#=> ['a', 'b', 'c']
#=> ['c', 'b', 'a']
```

Alternative: slicing with a step of -1 returns a reversed copy and leaves the original list unchanged:

```python
my_list = [1, 2, 3, 4, 5]
my_list[::-1]
#=> [5, 4, 3, 2, 1]
```

## 14. How does string multiplication work?

Let's see the result of multiplying the string `'cat'` by 3.

```python
'cat' * 3
#=> 'catcatcat'
```

## 15. How does list multiplication work?

Let's see the result of multiplying the list `[1, 2, 3]` by 2.

```python
[1, 2, 3] * 2
#=> [1, 2, 3, 1, 2, 3]
```

## 16. What does “self” mean in a class?

`self` refers to the instance on which a method is called. This is how we give methods access to, and the ability to update, the object they belong to.

Below, passing `self` to `__init__()` gives you the ability to set the colour for the instance at initialisation.

```python
class Shirt:
    def __init__(self, color):
        self.color = color
        
s = Shirt('yellow')
s.color
#=> 'yellow'
```

## 17. How to merge lists in Python?

**The `+` operator concatenates two lists into a new list**:

```python
a = [1,2]
b = [3,4,5]
c = a + b
print(c)
print(a + b)
#=> [1, 2, 3, 4, 5]
#=> [1, 2, 3, 4, 5]
```
**Method `.extend()`**:

Adds the elements of one list to the end of another. Modifies the original list.

```python
a = [1, 2]
b = [3, 4]
a.extend(b)
print(a) # [1, 2, 3, 4]
```

**Unpacking with `[*a, *b]`**:

```python
a = [1, 2]
b = [3, 4]
c = [*a, *b]
print(c) # [1, 2, 3, 4]
```

## 18. What is the difference between shallow copy vs. deep copy?

We'll discuss this in the context of a mutable object - a list. For immutable objects, shallow copy vs deep copy is not as relevant.

We will go through 3 scenarios

1) Reference to the original object. Assigning `li2 = li1` creates a new name, not a new list: `li2` points to the same memory location as `li1`. Thus, any change we make to `li1` is also visible through `li2`.

```python
li1 = [['a'],['b'],['c']]
li2 = li1
li1.append(['d'])
print(li2)
#=> [['a'], ['b'], ['c'], ['d']]
```

2) Create a shallow copy of the original. We can do this with the `list()` constructor or the more pythonic `mylist.copy()`.
The shallow copy creates a new outer list, but populates it with references to the same inner objects. Thus, appending a new element to `li3` does not propagate to `li4`, but modifying one of the inner lists of `li3` does.

```python
li3 = [['a'],['b'],['c']]
li4 = list(li3)
li3.append([4])
print(li4)
#=> [['a'], ['b'], ['c']]
li3[0][0] = ['X']
print(li4)
#=> [[['X']], ['b'], ['c']]
```

3) Create a deep copy. This is done with copy.deepcopy(). The 2 objects are now completely independent, and changes to one of them do not affect the other list.

```python
import copy
li5 = [['a'],['b'],['c']]
li6 = copy.deepcopy(li5)
li5.append([4])
li5[0][0] = ['X']
print(li6)
#=> [['a'], ['b'], ['c']]
```
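
The three scenarios can also be compared side by side with the `is` operator, which shows exactly which objects are shared. This is a compact sketch; the variable names are illustrative.

```python
import copy

original = [['a'], ['b']]
alias = original                  # same object, second name
shallow = original.copy()         # new outer list, shared inner lists
deep = copy.deepcopy(original)    # new outer list, new inner lists

print(alias is original)          # True
print(shallow is original)        # False
print(shallow[0] is original[0])  # True: inner list is shared
print(deep[0] is original[0])     # False: inner list was copied

original[0].append('x')           # mutate a shared inner list
print(shallow[0])                 # ['a', 'x']
print(deep[0])                    # ['a']
```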

## 19. What is the difference between lists and arrays?

**Note**: Python's standard library includes an `array` module, but here "array" means the NumPy array (`numpy.ndarray`, created with `np.array`), which is much more commonly used.

Lists are built into Python. Arrays are provided by NumPy.

Lists can be populated with different data types at each index. Arrays require homogeneous elements.

Arithmetic operators on lists concatenate (`+`) or repeat (`*`) them. Arithmetic on arrays is applied element by element.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([-1, -2, -3])
c = [1, 2, 3]
d = [-1, -2, -3]

a + b # -> array([0, 0, 0])
c + d # -> [1, 2, 3, -1, -2, -3]
```

Arrays also typically use less memory for numeric data and offer many more numerical operations.

## 20. How do you join two NumPy arrays?

Arrays are not lists. The `+` operator on NumPy arrays performs element-wise addition rather than concatenation.

To combine arrays, we need to use NumPy's concatenation function, `np.concatenate`.

```python
import numpy as np
a = np.array([1,2,3])
b = np.array([4,5,6])
np.concatenate((a,b))
#=> array([1, 2, 3, 4, 5, 6])
```

## 21. Name the mutable and immutable objects in Python

**Immutable objects** - the state cannot be changed after creation. Examples are `int`, `float`, `bool`, `str`, and `tuple`.

**Mutable objects** - the state can be changed after creation. Examples are `list`, `dict` and `set`.

## 22. How to round a number to 3 decimal places in Python?

Use the round(value, decimal_places) function:

```python
a = 5.12345
round(a,3)
#=> 5.123
```
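
A few details of `round()` tend to surprise people: ties are rounded to the nearest even value, the second argument may be negative, and rounding a number is not the same as formatting it for display. A short sketch:

```python
print(round(5.12345, 3))     # 5.123
print(round(0.5))            # 0  (ties go to the nearest even integer)
print(round(1.5))            # 2
print(round(2.5))            # 2
print(round(1234.5678, -2))  # 1200.0

# round() returns a number; for a fixed number of displayed digits use formatting
formatted = f"{5.1:.3f}"
print(formatted)             # 5.100
```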

## 23. How do you slice a list?

The **slice** syntax is `list[start:stop:step]`, where `start` is included, `stop` is excluded, and `step` is the interval between returned elements. All three parts are optional.

```python
a = [0,1,2,3,4,5,6,7,8,9]
print(a[:2])
#=> [0, 1]
print(a[8:])
#=> [8, 9]
print(a[2:8])
#=> [2, 3, 4, 5, 6, 7]
print(a[2:8:2])
#=> [2, 4, 6]
```

## 24. What is pickling?

**Pickling** is Python's way of serializing an object (converting it to bytes). The reverse operation, restoring an object from bytes, is called unpickling.
Pickle is used when you want to save Python objects to a file or transfer them over a network, so that you can restore an equivalent object later. Only unpickle data you trust, because loading a pickle can execute arbitrary code.

In the example below, we serialize and deserialize a list of dictionaries:

```python
import pickle
obj = [
    {'id':1, 'name':'Stuffy'},
    {'id':2, 'name': 'Fluffy'}
]
with open('file.p', 'wb') as f:
    pickle.dump(obj, f)
with open('file.p', 'rb') as f:
    loaded_obj = pickle.load(f)
print(loaded_obj)
#=> [{'id': 1, 'name': 'Stuffy'}, {'id': 2, 'name': 'Fluffy'}]
```
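
When no file is involved, the same round trip can be done in memory with `pickle.dumps` and `pickle.loads`. A minimal sketch reusing the data from the example above:

```python
import pickle

obj = [{'id': 1, 'name': 'Stuffy'}, {'id': 2, 'name': 'Fluffy'}]

data = pickle.dumps(obj)          # serialize to a bytes object
print(type(data).__name__)        # bytes

restored = pickle.loads(data)     # deserialize from bytes
print(restored == obj)            # True: same content
print(restored is obj)            # False: a new object was built
```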

## 25. What is the difference between dictionaries and JSON?

**Dictionary** is a Python data type: a collection of key-value pairs looked up by hashable keys. Since Python 3.7 it preserves insertion order.

**JSON** is a text format for exchanging data. In Python, a JSON document is simply a string that follows that format.

```python
import json

# Python Dictionary
data_dict = {
    "user": "Anna",
    "active": True,
    "roles": ["admin", "editor"],
    42: None  # non-string keys are allowed in a dict; JSON converts them to strings
}

# Serialize to JSON string
json_str = json.dumps(data_dict)
json_str # → '{"user": "Anna", "active": true, "roles": ["admin", "editor"], "42": null}'
```

## 26. What ORMs have you used in Python?

**What is an ORM**:
An `ORM` (Object-Relational Mapping) is a layer between your application objects and database tables.

It:
- automatically maps classes and their attributes to tables, rows and columns in the database;
- generates SQL queries for you and wraps them in Python methods;
- takes care of details such as transactions, lazy loading, migrations, etc.

**Examples in the Python ecosystem**:
- `SQLAlchemy` is the most popular “generic” ORM. It is often used in conjunction with (but not dependent on) Flask and supports both declarative-style and pure Core-SQL APIs.
- `Django ORM` is built right into the Django framework. It follows an “active record” approach, where each model class knows how to save itself to the database.
- `Other`: Peewee, PonyORM, Tortoise ORM, GINO for async applications, etc.

## 27. How do any() and all() work?

`any()` takes an iterable and returns `True` **if any** element is truthy.

`all()` returns `True` only **if all** elements are truthy.

```python
a = [False, False, False]
b = [True, False, False]
c = [True, True, True]
print( any(a) )
print( any(b) )
print( any(c) )
#=> False
#=> True
#=> True
print( all(a) )
print( all(b) )
print( all(c) )
#=> False
#=> False
#=> True
```
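
In practice both functions are often combined with generator expressions, and their behavior on empty iterables is a common follow-up question. A small sketch, with `scores` as an illustrative list:

```python
scores = [72, 85, 91]

print(any(s > 90 for s in scores))   # True: 91 is above 90
print(all(s > 80 for s in scores))   # False: 72 is not above 80

# Edge case: empty iterables
print(any([]))   # False: no element is true
print(all([]))   # True: no element is false
```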

## 28. Which is faster for finding an item - a dictionary or a list in Python?

Finding a value in a list takes `O(n)` time because the list must be scanned element by element until the value is found.

Finding a key in a dictionary takes `O(1)` time on average because a dictionary is a hash table.

This can make a huge difference in time if there are many values, so dictionaries are usually recommended for speed. But they have other limitations, such as the need for unique, hashable keys.

## 29. What is the difference between a module and a package?

**Module** is a single file of Python code (usually a `.py` file) that can be imported.

```python
import math
```
**Package** is a directory of modules.

```python
from sklearn import model_selection
```

## 30. How to increment and decrement an integer in Python?

Python has no `++` or `--` operators. Incrementing and decrementing are done with `+=` and `-=`:

```python
value = 5
value += 1
print(value)
#=> 6
value -= 1
value -= 1
print(value)
#=> 4
```

## 31. How do you convert an integer to a binary number?

Use bin() function:

```python
bin(5)
#=> '0b101'
```

## 32. How do you remove duplicate items from a list?

Removing duplicate items from a list can be done by converting the list to a set and then back to a list. Note that a set does not guarantee the original order of the elements.

```python
a = [1,1,1,2,3]
a = list(set(a))
print(a)
#=> [1, 2, 3]
```
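
If the original order matters, one common alternative is `dict.fromkeys`, which relies on dictionaries preserving insertion order (Python 3.7+). A minimal sketch with an illustrative list:

```python
names = ['b', 'a', 'b', 'c', 'a']

unique_in_order = list(dict.fromkeys(names))   # keeps the first occurrence of each item
print(unique_in_order)                         # ['b', 'a', 'c']

unique_sorted = sorted(set(names))             # or sort explicitly for a stable result
print(unique_sorted)                           # ['a', 'b', 'c']
```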

## 33. How do you check if a value exists in a list?

Use `in`:
```python
'a' in ['a','b','c']
#=> True
'a' in [1,2,3]
#=> False
```

## 34. What is the difference between append and extend?

`append` adds a single value to the end of the list, and `extend` adds each element of another iterable to the list:

```python
a = [1,2,3]
b = [1,2,3]
a.append(6)
print(a)
#=> [1, 2, 3, 6]
b.extend([4,5])
print(b)
#=> [1, 2, 3, 4, 5]
```

## 35. How to get the absolute value (modulus) of an integer?

Use the `abs()` function:

```python
abs(2)
#=> 2
abs(-2)
#=> 2
```

## 36. How do you merge two lists into a list of tuples?

You can use the `zip()` function to merge lists into a list of tuples. `zip()` is not limited to 2 lists; it can also be called with 3 or more.

```python
a = ['a','b','c']
b = [1,2,3]
list(zip(a,b))
#=> [('a', 1), ('b', 2), ('c', 3)]
```
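
Two details are worth seeing in code: `zip()` stops at the shortest input, and `zip(*pairs)` reverses the operation. The sketch below uses illustrative lists.

```python
names = ['a', 'b', 'c']
ids = [1, 2, 3]
flags = [True, False]            # deliberately shorter

triples = list(zip(names, ids, flags))
print(triples)                   # [('a', 1, True), ('b', 2, False)]

pairs = list(zip(names, ids))
letters, numbers = zip(*pairs)   # "unzip" back into two tuples
print(letters)                   # ('a', 'b', 'c')
print(numbers)                   # (1, 2, 3)

lookup = dict(zip(names, ids))   # pairs also convert directly to a dict
print(lookup)                    # {'a': 1, 'b': 2, 'c': 3}
```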

## 37. How do you sort a dictionary by key in alphabetical order?

A dictionary keeps its keys in insertion order (guaranteed since Python 3.7) and has no sort method, but you can get a sorted list of `(key, value)` tuples with `sorted()`.

```python
d = {'c':3, 'd':4, 'b':2, 'a':1}
sorted(d.items())
#=> [('a', 1), ('b', 2), ('c', 3), ('d', 4)]
```
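
Because dictionaries preserve insertion order, the sorted pairs can be turned back into a dictionary. The sketch below also shows sorting by value, which is a frequent follow-up; the variable names are illustrative.

```python
d = {'c': 3, 'd': 4, 'b': 2, 'a': 1}

by_key = dict(sorted(d.items()))     # rebuild the dict in key order
print(by_key)                        # {'a': 1, 'b': 2, 'c': 3, 'd': 4}

by_value_desc = dict(sorted(d.items(), key=lambda kv: kv[1], reverse=True))
print(by_value_desc)                 # {'d': 4, 'c': 3, 'b': 2, 'a': 1}
```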

## 38. How does a class inherit from another class in Python?

In the example below, `Audi` inherits from `Car`, so instances of `Audi` get the methods defined on the parent class.

```python
class Car:
    def drive(self):
        print('vroom')

class Audi(Car):
    pass

audi = Audi()
audi.drive()
#=> vroom
```
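
A child class can also override a parent method and still reuse the parent's behavior through `super()`. The following sketch extends the example with hypothetical `brand` and `model` attributes.

```python
class Car:
    def __init__(self, brand):
        self.brand = brand

    def drive(self):
        return 'vroom'

class Audi(Car):
    def __init__(self, model):
        super().__init__('Audi')     # run the parent initializer first
        self.model = model

    def drive(self):                 # override, then extend the parent behavior
        return super().drive() + ' (quietly)'

audi = Audi('A4')
print(audi.brand, audi.model)        # Audi A4
print(audi.drive())                  # vroom (quietly)
print(isinstance(audi, Car))         # True
print(issubclass(Audi, Car))         # True
```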

## 39. How can you remove all spaces from a string?

Use the `replace()` method to remove space characters:

```python
s = 'A string with     white space'
s.replace(' ', '')
#=> 'Astringwithwhitespace'
```
Alternative that removes all whitespace (spaces, tabs, newlines):
```python
''.join(s.split())
#=> 'Astringwithwhitespace'
```

## 40. Why should enumerate() be used to iterate a sequence?

The `enumerate()` function keeps track of the index as you iterate through the sequence. This is more pythonic than defining and incrementing an integer representing the index.

```python
li = ['a','b','c','d','e']
for idx,val in enumerate(li):
    print(idx,val)
#=> 0 a
#=> 1 b
#=> 2 c
#=> 3 d
#=> 4 e
```

## 41. What is the difference between pass, continue and break?

`pass` means to do nothing. It is usually used as a placeholder because Python does not allow a class, function, or `if` statement with an empty body.
In the example below, the `if i > 3:` block would be a syntax error without a statement inside it, so `pass` is used.

```python
a = [1,2,3,4,5]
for i in a:
    if i > 3:
        pass
    print(i)
#=> 1
#=> 2
#=> 3
#=> 4
#=> 5
```
`continue` skips the rest of the loop body for the current element and moves on to the next one. So `print(i)` is never executed for values where `i < 3`.

```python
for i in a:
    if i < 3:
        continue
    print(i)
#=> 3
#=> 4
#=> 5
```
`break` stops the loop entirely, so the remaining elements are not processed. Thus, 3 and every element after it are not printed.

```python
for i in a:
    if i == 3:
        break
    print(i)    
#=> 1
#=> 2
```

## 42. How do you convert a for loop into a list comprehension?

Consider an ordinary for loop:
```python
a = [1,2,3,4,5]

a2 = []
for i in a:
    a2.append(i + 1)
print(a2)
#=> [2, 3, 4, 5, 6]
```

The same logic as a list comprehension:

```python
a3 = [i+1 for i in a]
print(a3)
#=> [2, 3, 4, 5, 6]
```
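
A comprehension can also filter elements with a trailing `if`, or choose between two values per element with a conditional expression. A short sketch with illustrative names:

```python
a = [1, 2, 3, 4, 5]

evens_squared = [i * i for i in a if i % 2 == 0]        # filter, then transform
print(evens_squared)                                    # [4, 16]

labels = ['even' if i % 2 == 0 else 'odd' for i in a]   # pick a value per element
print(labels)                                           # ['odd', 'even', 'odd', 'even', 'odd']
```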

## 43. Give an example of a ternary operator

A ternary operator is a one-line if / else expression.
The syntax is `a if condition else b`.
```python
x = 5
y = 10
'greater' if x > 6 else 'less'
#=> 'less'
'greater' if y > 6 else 'less'
#=> 'greater'
```

## 44. How to check that the string contains only digits?

Use the `isdigit()` or `isnumeric()` string methods:

| Method        | Checks for...                                | Example            | Result  |
|---------------|-----------------------------------------------|--------------------|---------|
| `isdigit()`   | Only digits (0–9 and some Unicode digits)     | `'123'.isdigit()`  | `True`  |
|               |                                               | `'²'.isdigit()`     | `True`  |
|               |                                               | `'Ⅷ'.isdigit()`     | `False` |
| `isnumeric()` | All numeric characters (digits, fractions, Roman numerals, etc.) | `'123'.isnumeric()` | `True`  |
|               |                                               | `'Ⅷ'.isnumeric()`   | `True`  |
|               |                                               | `'¼'.isnumeric()` | `True` |

**Summary:**  
- Use `isdigit()` for standard digits.  
- Use `isnumeric()` when you want to accept a broader range of numeric Unicode characters.


## 45. How to check that a string contains only letters?

Use the `isalpha()` method:

```python
'123a'.isalpha()
#=> False
'a'.isalpha()
#=> True
```

## 46. How to check that the string contains only numbers and letters?

Use the `isalnum()` method:

```python
'123abc...'.isalnum()
#=> False
'123abc'.isalnum()
#=> True
```

## 47. How to return the list of keys from the dictionary?

To return a list of keys from a dictionary in Python, pass the dictionary to `list()`:

```python
my_dict = {'a': 1, 'b': 2, 'c': 3}
list(my_dict) # list(my_dict) == list(my_dict.keys())
# ['a', 'b', 'c']
```

## 48. How do you convert a string to upper and lower case?

You can use the string methods `upper()` and `lower()` for case conversion:

```python
small_word = 'potatocake'
big_word = 'FISHCAKE'
small_word.upper()
#=> 'POTATOCAKE'
big_word.lower()
#=> 'fishcake'
```

## 49. What is the difference between remove, del and pop?

`remove()` removes the first matching value:

```python
li = ['a','b','c','d']
li.remove('b')
li
#=> ['a', 'c', 'd']
```

`del` is a statement that deletes an element by index:

```python
li = ['a','b','c','d']
del li[0]
li
#=> ['b', 'c', 'd']
```

`pop()` removes an element by index and returns that element:

```python
li = ['a','b','c','d']
li.pop(2)
#=> 'c'
li
#=> ['a', 'b', 'd']
```

## 50. Give an example of dictionary comprehension

Below we create a dictionary with the letters of the alphabet as keys and their index in the alphabet as values:

```python
# creating a list of letters
import string
alphabet = list(string.ascii_lowercase)
# dictionary comprehension
d = {val:idx for idx,val in enumerate(alphabet)}
d
#=> {'a': 0,
#=>  'b': 1,
#=>  'c': 2,
#=> ...
#=>  'x': 23,
#=>  'y': 24,
#=>  'z': 25}
```

## 51. How is exception handling done in Python?

Python provides three keywords for handling exceptions: `try`, `except` and `finally`.
The syntax is as follows:

```python
try:
    ...  # try to do this
except Exception:
    ...  # if the try block fails, do this
finally:
    ...  # always do this
```
In the following example, the try block fails because we cannot add an integer to a string. The except block sets `val = 10`, and then the finally block prints `complete`.

```python
try:
    val = 1 + 'A'
except TypeError:
    val = 10
finally:
    print('complete')
    
print(val)
#=> complete
#=> 10
```
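
Beyond the basic form, it is usually better to catch specific exception types, and an optional `else` block runs only when no exception occurred. The sketch below uses a hypothetical `safe_divide` helper to show the order in which the blocks run.

```python
def safe_divide(x, y):
    """Hypothetical helper: divide x by y, returning None on bad input."""
    steps = []
    try:
        result = x / y
    except ZeroDivisionError:
        steps.append('zero division')
        result = None
    except TypeError as exc:
        steps.append(f'type error: {exc}')
        result = None
    else:
        steps.append('ok')           # runs only if no exception was raised
    finally:
        steps.append('done')         # runs in every case
    return result, steps

print(safe_divide(6, 3))       # (2.0, ['ok', 'done'])
print(safe_divide(6, 0))       # (None, ['zero division', 'done'])
print(safe_divide(6, 'a')[0])  # None
```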

------
**Source**: https://python.ivan-shamaev.ru/python-job-interview-question-and-answer-for-data-scientist

**Translated by**: Georgii Kutivadze