In [None]:
### **Basic Python Questions**
'''
Q1) What is the difference between `list`, `tuple`, and `set` in Python?
A) A list is a mutable, ordered collection. 
   A tuple is an immutable, ordered collection. 
   A set is an unordered collection of unique elements.
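
The three behaviours can be checked with a short sketch. The variable names are illustrative only and do not come from the notes above.

```python
# Illustrative data: the same three items stored in each collection type.
fruits_list = ["apple", "banana", "apple"]
fruits_tuple = ("apple", "banana", "apple")
fruits_set = {"apple", "banana", "apple"}

# Lists are mutable: items can be replaced or appended in place.
fruits_list[0] = "cherry"
fruits_list.append("kiwi")

# Tuples are immutable: item assignment raises TypeError.
try:
    fruits_tuple[0] = "cherry"
    tuple_was_changed = True
except TypeError:
    tuple_was_changed = False

# Sets silently drop duplicates, so only two elements remain.
print(fruits_list)        # ['cherry', 'banana', 'apple', 'kiwi']
print(tuple_was_changed)  # False
print(len(fruits_set))    # 2
```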
   
Q2) How does Python handle memory management?
A) Python manages memory automatically. All objects and data structures live in a private heap, and the memory manager handles allocation and deallocation on that heap.
   In CPython, an object is freed as soon as its reference count drops to zero, and a cyclic garbage collector reclaims groups of objects that only reference each other.
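
A minimal sketch of the garbage collector reclaiming a reference cycle, assuming CPython. `Node` is a hypothetical class used only for illustration.

```python
import gc
import weakref

class Node:
    """Hypothetical class used only to build a reference cycle."""
    def __init__(self):
        self.partner = None

gc.disable()                 # pause automatic collection so the demo is deterministic

a, b = Node(), Node()
a.partner, b.partner = b, a  # a and b now reference each other
probe = weakref.ref(a)       # a weak reference does not keep the object alive

del a, b                     # the names are gone, but the cycle keeps both objects alive
alive_before = probe() is not None

gc.collect()                 # the cyclic collector finds and frees the unreachable pair
alive_after = probe() is not None

gc.enable()
print(alive_before, alive_after)  # True False (on CPython)
```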
   
Q4) How do you handle exceptions in Python?
A) Exceptions in Python are handled using `try`, `except`, `else`, and `finally` blocks. The `else` block runs only when no exception was raised, and `finally` always runs. Example:
   try:
       result = 10 / 0
   except ZeroDivisionError:
       print("You can't divide by zero!")
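
The example above uses only `try` and `except`. A minimal sketch of all four clauses together, where `safe_divide` is a hypothetical helper introduced for illustration:

```python
def safe_divide(numerator, denominator):
    """Hypothetical helper that records which clauses ran."""
    steps = []
    try:
        result = numerator / denominator
    except ZeroDivisionError:
        steps.append("except")   # runs only if the try block raised
        result = None
    else:
        steps.append("else")     # runs only if the try block did not raise
    finally:
        steps.append("finally")  # always runs, with or without an exception
    return result, steps

print(safe_divide(10, 2))  # (5.0, ['else', 'finally'])
print(safe_divide(10, 0))  # (None, ['except', 'finally'])
```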
       
Q5) What is the difference between a deep copy and a shallow copy?
A) A shallow copy creates a new object but inserts references into it to the objects found in the original.
   A deep copy creates a new object and recursively copies all objects found in the original.

'''

In [None]:
############# Basic Data Structure ############
### 1. What are Python lists and how are they different from tuples?
'''
**Python Lists:**
- Lists are mutable (i.e., their elements can be changed).
- Lists are defined using square brackets `[]`.
- Example: `my_list = [1, 2, 3, 4]`
  
**Tuples:**
- Tuples are immutable (i.e., once created, their elements cannot be changed).
- Tuples are defined using parentheses `()`.
- Example: `my_tuple = (1, 2, 3, 4)`

**Difference:**
- The key difference is mutability: lists can be changed in place, while tuples cannot be altered after creation.
- Because a tuple has a fixed size, it typically uses slightly less memory than a list holding the same elements, and a tuple of hashable elements can be used as a dictionary key or set member.
'''

### 2. How do you create a dictionary in Python and access its values?
'''
**Creating a Dictionary:**
```python
my_dict = {"name": "Alice", "age": 25, "city": "New York"}
```

**Accessing Values:**
- By Key: `my_dict["name"]` will return `"Alice"`
- Using the `get()` method: `my_dict.get("age")` will return `25`
  
**Example:**
```python
print(my_dict["city"])  # Output: New York
print(my_dict.get("age"))  # Output: 25
```
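
The two access styles differ when the key is missing. A short sketch, using a key (`"email"`) that is assumed not to exist in the dictionary:

```python
my_dict = {"name": "Alice", "age": 25, "city": "New York"}

# get() returns None, or a supplied default, when the key is missing.
print(my_dict.get("email"))             # None
print(my_dict.get("email", "unknown"))  # unknown

# Square-bracket access raises KeyError for a missing key.
try:
    my_dict["email"]
except KeyError as err:
    missing_key = err.args[0]

print(missing_key)  # email
```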
'''

### 3. Explain list comprehension and provide an example.
'''
**List Comprehension:**
- A concise way to create lists in Python.
- Syntax: `[expression for item in iterable if condition]`
  
**Example:**
```python
# Create a list of squares of even numbers from 0 to 10
squares = [x**2 for x in range(11) if x % 2 == 0]
print(squares)  # Output: [0, 4, 16, 36, 64, 100]
```
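
For comparison, the same result can be built with an explicit loop. This sketch only restates the comprehension above in longer form:

```python
# Loop version: append the square of each even number from 0 to 10.
squares_loop = []
for x in range(11):
    if x % 2 == 0:
        squares_loop.append(x**2)

# Comprehension version from the example above.
squares_comp = [x**2 for x in range(11) if x % 2 == 0]

print(squares_loop == squares_comp)  # True
```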
'''


In [None]:
####################### Data Frame/ PANDAS ##################
### 4. How can you read a CSV file in Python using pandas?
'''
**Reading a CSV File:**
```python
import pandas as pd

df = pd.read_csv("file.csv")
```
- This code reads a CSV file named `"file.csv"` and stores it in a pandas DataFrame `df`.
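
A self-contained sketch that can be run without a file on disk: `read_csv` also accepts a file-like object, so the hypothetical CSV text below is wrapped in `io.StringIO`. The column names and values are made up for illustration.

```python
import io
import pandas as pd

# Hypothetical CSV content; Bob's age is left empty on purpose.
csv_text = """name,age,city
Alice,25,New York
Bob,,Paris
"""

df = pd.read_csv(io.StringIO(csv_text))
print(df.shape)          # (2, 3)
print(list(df.columns))  # ['name', 'age', 'city']

# The empty field is read as NaN.
print(int(df["age"].isna().sum()))  # 1

# usecols limits which columns are loaded.
names_only = pd.read_csv(io.StringIO(csv_text), usecols=["name", "city"])
print(list(names_only.columns))     # ['name', 'city']
```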
'''

### 5. What is the difference between loc and iloc in pandas?
'''
- **`loc`:** Accesses a group of rows and columns by index labels or boolean arrays.
- **`iloc`:** Accesses a group of rows and columns by integer position (0-based), regardless of the index labels.

**Example:**
```python
# Using loc
df.loc[2, 'column_name']  # Accesses the value in the row with index label 2 and the specified column

# Using iloc
df.iloc[2, 3]  # Accesses the value in the third row and fourth column (0-based index)
```
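
With the default index, labels and positions coincide, which hides the difference. A sketch with a hypothetical DataFrame whose index labels deliberately differ from the row positions:

```python
import pandas as pd

# Hypothetical data: index labels 2, 0, 1 sit at positions 0, 1, 2.
df = pd.DataFrame(
    {"name": ["Ann", "Ben", "Cy"], "score": [90, 80, 70]},
    index=[2, 0, 1],
)

print(df.loc[2, "name"])  # Ann  (row whose index label is 2)
print(df.iloc[2, 0])      # Cy   (third row by position, first column)

# Slices differ too: loc includes the end label, iloc excludes the end position.
print(len(df.loc[2:0]))   # 2
print(len(df.iloc[0:1]))  # 1
```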
'''

### 6. How do you handle missing data in a pandas DataFrame?
'''
**Handling Missing Data:**
- **Drop missing data:** `df.dropna()` returns a new DataFrame without the rows that contain missing values.
- **Fill missing data:** `df.fillna(value)` returns a new DataFrame with missing values replaced by the specified `value`.
- **Check for missing data:** `df.isnull()` returns a boolean DataFrame that is `True` wherever a value is missing.

**Example:**
```python
df_dropped = df.dropna()  # New DataFrame without rows that have any missing value
df_filled = df.fillna(0)  # New DataFrame with all NaN values replaced by 0
```
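
A runnable sketch on a small, made-up DataFrame. It also shows that these methods leave the original untouched, and fills each column with its own mean as one possible strategy:

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value per column.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [np.nan, 5.0, 6.0]})

print(int(df.isnull().sum().sum()))  # 2 missing values in total

dropped = df.dropna()                # keeps only the fully populated row
filled = df.fillna(df.mean())        # fills each column with its own mean

print(len(dropped))                  # 1
print(filled["a"].tolist())          # [1.0, 2.0, 3.0]
print(len(df))                       # 3, the original is unchanged
```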
'''

### 7. Explain the use of the apply() function in pandas.
'''
**`apply()` Function:**
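- On a DataFrame, `apply()` runs a function along an axis: each column by default, or each row with `axis=1`.
- On a Series (a single column), `apply()` runs the function on each value, which is useful for custom element-wise operations.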

**Example:**
```python
# Apply a function to double the values in a column
df['column_name'] = df['column_name'].apply(lambda x: x * 2)
```
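
The example above uses `Series.apply`. A sketch of `DataFrame.apply` along each axis, with made-up column names:

```python
import pandas as pd

# Hypothetical order data.
df = pd.DataFrame({"price": [10.0, 20.0], "quantity": [3, 2]})

# axis=1 passes each row to the function as a Series.
df["total"] = df.apply(lambda row: row["price"] * row["quantity"], axis=1)

# The default axis=0 passes each column to the function instead.
column_max = df[["price", "quantity"]].apply(max)

print(df["total"].tolist())  # [30.0, 40.0]
print(column_max["price"])   # 20.0
```

For simple arithmetic like this, the vectorized form `df["price"] * df["quantity"]` gives the same result and is usually faster; `apply()` is most useful when the logic cannot be expressed with built-in operations.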
'''

### 8. How can you merge/join two DataFrames in pandas?
'''
**Merging/Joining DataFrames:**
- **`merge()` function:** Combines DataFrames based on one or more key columns. The default is an inner join, which keeps only the keys present in both DataFrames.
- **`join()` method:** Combines DataFrames based on their index by default.

**Example:**
```python
# Merge on a common column
df1 = pd.DataFrame({'key': ['A', 'B', 'C'], 'value': [1, 2, 3]})
df2 = pd.DataFrame({'key': ['A', 'B', 'D'], 'value': [4, 5, 6]})
merged_df = pd.merge(df1, df2, on='key')  # Rows for keys 'A' and 'B'; columns: key, value_x, value_y
```
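
A sketch comparing the default inner join with an outer join on the same two DataFrames. The suffix names `_left` and `_right` are arbitrary choices for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({"key": ["A", "B", "C"], "value": [1, 2, 3]})
df2 = pd.DataFrame({"key": ["A", "B", "D"], "value": [4, 5, 6]})

# Inner join (the default): only keys found in both DataFrames.
inner = pd.merge(df1, df2, on="key")

# Outer join: every key from either side; unmatched cells become NaN.
outer = pd.merge(df1, df2, on="key", how="outer", suffixes=("_left", "_right"))

print(inner["key"].tolist())  # ['A', 'B']
print(list(inner.columns))    # ['key', 'value_x', 'value_y']
print(sorted(outer["key"]))   # ['A', 'B', 'C', 'D']
print(list(outer.columns))    # ['key', 'value_left', 'value_right']
```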
'''

### 9. Describe how to group data in pandas and perform aggregation.
'''
**Grouping and Aggregation:**
- **`groupby()` function:** Groups data based on a column or columns.
- **Aggregation:** Applying functions like `sum()`, `mean()`, etc., to grouped data.

**Example:**
```python
grouped = df.groupby('category')['sales'].sum()
print(grouped)
```
- This example groups data by the `"category"` column and sums the `"sales"`.
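
A runnable sketch with made-up data that computes several aggregations at once using `agg()`:

```python
import pandas as pd

# Hypothetical sales records.
df = pd.DataFrame({
    "category": ["A", "B", "A", "B"],
    "sales": [100, 200, 300, 400],
})

# One row per category, one column per aggregation.
summary = df.groupby("category")["sales"].agg(["sum", "mean", "count"])

print(summary.loc["A", "sum"])   # 400
print(summary.loc["B", "mean"])  # 300.0
```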
'''


### 10. What is a lambda function in Python and how is it used?
'''
**Lambda Function:**
- A small anonymous function defined with the `lambda` keyword.
- Used for simple operations and often as arguments to higher-order functions.

**Example:**
```python
add = lambda x, y: x + y
print(add(5, 3))  # Output: 8
```
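
A sketch of lambdas passed to higher-order functions (`sorted`, `filter`, and `map`), using a made-up list of name and age pairs:

```python
# Hypothetical (name, age) records.
people = [("Alice", 25), ("Bob", 19), ("Carol", 32)]

# Sort by the second element of each pair (the age).
by_age = sorted(people, key=lambda person: person[1])

# Keep people aged 21 or over, then extract their names.
adults = filter(lambda person: person[1] >= 21, people)
adult_names = list(map(lambda person: person[0], adults))

print(by_age[0])    # ('Bob', 19)
print(adult_names)  # ['Alice', 'Carol']
```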
'''

### 11. Describe how to filter a DataFrame based on a condition.
'''
**Filtering a DataFrame:**
- Use boolean indexing to filter rows based on a condition.

**Example:**
```python
filtered_df = df[df['age'] > 30]
```
- This filters the DataFrame to include only rows where the `"age"` column is greater than 30.
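
Several conditions can be combined in one filter. A sketch with made-up data:

```python
import pandas as pd

# Hypothetical records.
df = pd.DataFrame({
    "name": ["Ann", "Ben", "Cy", "Di"],
    "age": [25, 35, 45, 31],
    "city": ["Oslo", "Rome", "Oslo", "Oslo"],
})

# Combine conditions with & (and), | (or), ~ (not); each condition needs parentheses.
older_in_oslo = df[(df["age"] > 30) & (df["city"] == "Oslo")]
print(older_in_oslo["name"].tolist())  # ['Cy', 'Di']

# query() expresses the same filter as a string.
same_rows = df.query("age > 30 and city == 'Oslo'")
print(same_rows["name"].tolist())      # ['Cy', 'Di']
```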

'''



In [None]:

########### NUMPY #########
### 10. What are NumPy arrays and how do they differ from Python lists?
'''
**NumPy Arrays:**
- Arrays are homogeneous (all elements are of the same type).
- Support element-wise operations and are more efficient for numerical computations.
- Created using the NumPy library: `import numpy as np`

**Difference from Lists:**
- Lists can contain mixed data types, while NumPy arrays are homogeneous.
- NumPy arrays offer better performance and functionality for numerical operations.

**Example:**
```python
import numpy as np

arr = np.array([1, 2, 3, 4])
```
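
A short sketch of both differences: a mixed list is converted to a single dtype, and `*` means repetition for a list but element-wise multiplication for an array.

```python
import numpy as np

mixed_list = [1, 2.5, 3]
arr = np.array(mixed_list)  # the ints are upcast so all elements share one dtype

print(arr.dtype)            # float64
print(mixed_list * 2)       # [1, 2.5, 3, 1, 2.5, 3]  (list repetition)
print((arr * 2).tolist())   # [2.0, 5.0, 6.0]         (element-wise)
```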
'''

### 11. How do you perform element-wise operations on NumPy arrays?
'''
**Element-Wise Operations:**
- Operations like addition, subtraction, multiplication, etc., are applied element-wise.

**Example:**
```python
import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

result = arr1 + arr2  # Output: array([5, 7, 9])
```
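
A sketch of further element-wise operations, including broadcasting, where NumPy stretches a scalar or a smaller array to match the other operand's shape:

```python
import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

print((arr1 * arr2).tolist())  # [4, 10, 18]
print((arr1 ** 2).tolist())    # [1, 4, 9]
print((arr2 > 4).tolist())     # [False, True, True]

# Broadcasting: the scalar 10 is applied to every element.
print((arr1 + 10).tolist())    # [11, 12, 13]

# A (2, 1) column combined with a (3,) row broadcasts to shape (2, 3).
column = np.array([[0], [100]])
print((column + arr1).tolist())  # [[1, 2, 3], [101, 102, 103]]
```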
'''

In [None]:

######### datetime module 
### 17. How do you use the datetime module to manipulate dates and times in Python?
'''
**Using the `datetime` Module:**
- The `datetime` module provides classes for manipulating dates and times.
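
Date arithmetic is done with `timedelta`. A minimal sketch, using fixed example dates so the results are reproducible:

```python
from datetime import datetime, timedelta

start = datetime(2024, 8, 29, 10, 0, 0)

# Adding a timedelta to a datetime gives a new datetime.
deadline = start + timedelta(days=7, hours=2)
print(deadline)         # 2024-09-05 12:00:00

# Subtracting two datetimes gives a timedelta.
elapsed = datetime(2024, 12, 25) - start
print(elapsed.days)     # 117

# weekday() counts from Monday = 0, so 3 is a Thursday.
print(start.weekday())  # 3
```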
'''
# Example:
from datetime import datetime

# Current date and time
now = datetime.now()

# Formatting date
formatted_date = now.strftime("%Y-%m-%d %H:%M:%S")
print(formatted_date)  # Example output (depends on when it runs): 2024-08-29 10:00:00

# Parsing a date string
date_str = "2024-08-29"
date_obj = datetime.strptime(date_str, "%Y-%m-%d")
print(date_obj)  # Output: 2024-08-29 00:00:00


In [None]:

######## Regular Expression 
### 20. Describe how to use regular expressions in Python for data cleaning.
'''
- **Regular Expressions (Regex)**: Regex is a powerful tool for pattern matching in text. It is commonly used for searching, replacing, 
and parsing text in data cleaning tasks.

- **Example**:
  - Suppose you have a dataset with phone numbers in different formats, and you want to standardize them.
''' 
import re

# Sample data
phone_numbers = ["(123) 456-7890", "123-456-7890", "+1 123 456 7890", "123.456.7890"]

# Pattern that matches any single character that is not a digit
pattern = r'\D'

# Standardize phone numbers by stripping every non-digit character
cleaned_numbers = [re.sub(pattern, '', number) for number in phone_numbers]

print(cleaned_numbers)
# Output: ['1234567890', '1234567890', '11234567890', '1234567890']
# Note: the third number keeps its leading country code (1), so it has 11 digits.

'''
- **Common uses of Regex in Data Cleaning**:
  - **Removing unwanted characters**: E.g., stripping out punctuation or special characters.
  - **Extracting specific data patterns**: E.g., extracting email addresses, phone numbers, or dates.
  - **Validating formats**: E.g., ensuring that strings follow a specific format like a valid email address.

Regular expressions are an essential tool in data preprocessing, especially when working with unstructured text data.
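
A sketch of the extraction and validation uses listed above. The sample text and both patterns are assumptions made for illustration; the patterns are deliberately simple and are not a complete email or date validator.

```python
import re

# Hypothetical free-text input.
text = "Contact ann@example.com or ben.lee@example.org before 2024-08-29."

# Simplified patterns, assumed for this sketch only.
email_pattern = r"[A-Za-z0-9.+-]+@[A-Za-z0-9-]+[.][A-Za-z0-9.-]+"
date_pattern = r"[0-9]{4}-[0-9]{2}-[0-9]{2}"

# Extraction: findall returns every non-overlapping match.
emails = re.findall(email_pattern, text)
dates = re.findall(date_pattern, text)
print(emails)  # ['ann@example.com', 'ben.lee@example.org']
print(dates)   # ['2024-08-29']

# Validation: fullmatch succeeds only if the whole string fits the pattern.
print(bool(re.fullmatch(date_pattern, "2024-08-29")))  # True
print(bool(re.fullmatch(date_pattern, "29/08/2024")))  # False
```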
'''

In [None]:
### 18. Explain the difference between a shallow copy and a deep copy in Python.
'''
- **Shallow Copy**:
  - A shallow copy creates a new outer object but fills it with references to the same nested objects as the original.
  As a result, a change made to a nested object through the copy is also visible through the original, because both containers
  refer to that same nested object.
  - Example:
    ```python
    import copy

    original = [[1, 2, 3], [4, 5, 6]]
    shallow_copy = copy.copy(original)
    
    shallow_copy[0][0] = 10  # This will also modify the original list.
    
    print(original)       # Output: [[10, 2, 3], [4, 5, 6]]
    print(shallow_copy)   # Output: [[10, 2, 3], [4, 5, 6]]
    ```

- **Deep Copy**:
  - A deep copy creates a new object and recursively copies all objects found within the original object, creating a completely independent object. Changes made to the nested objects in the copied object do not affect the original object.
  - Example:
    ```python
    deep_copy = copy.deepcopy(original)
    
    deep_copy[0][0] = 20  # This will not modify the original list.
    
    print(original)     # Output: [[10, 2, 3], [4, 5, 6]]  (the 10 comes from the shallow-copy example)
    print(deep_copy)    # Output: [[20, 2, 3], [4, 5, 6]]
    ```
'''

### 19. How can you perform data normalization or standardization in Python?
'''
- **Normalization**: This process rescales the features of your data to be in a range of `[0, 1]`.
  - Example using `MinMaxScaler` from `sklearn`:
    ```python
    from sklearn.preprocessing import MinMaxScaler
    import numpy as np

    data = np.array([[2, 3], [4, 6], [10, 15]])
    scaler = MinMaxScaler()
    normalized_data = scaler.fit_transform(data)

    print(normalized_data)
    # Output: [[0.   0.  ]
    #          [0.25 0.25]
    #          [1.   1.  ]]
    ```

- **Standardization**: This process scales your data such that it has a mean of `0` and a standard deviation of `1`.
  - Example using `StandardScaler` from `sklearn`:
    ```python
    from sklearn.preprocessing import StandardScaler
    import numpy as np

    data = np.array([[2, 3], [4, 6], [10, 15]])
    scaler = StandardScaler()
    standardized_data = scaler.fit_transform(data)

    print(standardized_data)
    # Output (rounded): [[-0.98 -0.98]
    #                    [-0.39 -0.39]
    #                    [ 1.37  1.37]]
    ```
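
The formulas behind both scalers can be sketched with NumPy alone. This is a hand-written restatement on the same data, not scikit-learn's implementation; it assumes the population standard deviation (`ddof=0`), which is what `StandardScaler` uses.

```python
import numpy as np

data = np.array([[2, 3], [4, 6], [10, 15]], dtype=float)

# Min-max normalization: (x - min) / (max - min), computed per column.
col_min = data.min(axis=0)
col_max = data.max(axis=0)
normalized = (data - col_min) / (col_max - col_min)

# Standardization: (x - mean) / std, computed per column (std with ddof=0).
standardized = (data - data.mean(axis=0)) / data.std(axis=0)

print(normalized[:, 0].tolist())                 # [0.0, 0.25, 1.0]
print(np.round(standardized[:, 0], 2).tolist())  # [-0.98, -0.39, 1.37]
```

After standardization, each column has a mean of 0 and a standard deviation of 1, up to floating-point error.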

'''

In [None]:

### 12. What is the use of the Matplotlib library in Python? Provide an example of a simple plot.
'''
**Matplotlib Library:**
- Used for creating static, animated, and interactive visualizations in Python.

**Example of a Simple Plot:**
```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4]
y = [10, 20, 25, 30]

plt.plot(x, y)
plt.xlabel('x-axis')
plt.ylabel('y-axis')
plt.title('Simple Line Plot')
plt.show()
```
'''

### 13. How do you create subplots in Matplotlib?
'''
**Creating Subplots:**
- Use `plt.subplot()` to create multiple plots within a single figure.

**Example:**
```python
import matplotlib.pyplot as plt

plt.subplot(1, 2, 1)
plt.plot([1, 2, 3], [4, 5, 6])

plt.subplot(1, 2, 2)
plt.plot([1, 2, 3], [6, 5, 4])

plt.show()
```
- This code creates a figure with two subplots arranged in one row and two columns.
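
An alternative sketch using `plt.subplots()`, which returns the figure and an array of axes objects so each subplot can be configured explicitly. The titles are made up for illustration.

```python
import matplotlib.pyplot as plt

# One row, two columns; axes is an array holding the two subplot objects.
fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(8, 3))

axes[0].plot([1, 2, 3], [4, 5, 6])
axes[0].set_title("Increasing")

axes[1].plot([1, 2, 3], [6, 5, 4])
axes[1].set_title("Decreasing")

fig.tight_layout()  # adjust spacing so titles and labels do not overlap
```

In a script or interactive session, follow this with `plt.show()` to display the figure.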
'''

### 14. Explain the use of the Seaborn library and provide an example of a categorical plot.
'''
**Seaborn Library:**
- A data visualization library based on Matplotlib that provides a high-level interface for drawing attractive and informative statistical 
graphics.

**Example of a Categorical Plot:**
```python
import seaborn as sns
import matplotlib.pyplot as plt

tips = sns.load_dataset("tips")
sns.barplot(x="day", y="total_bill", data=tips)
plt.show()
```
- This code creates a bar plot showing the mean total bill for each day in the dataset (Thursday through Sunday). By default, the error bars show a confidence interval around each mean.
'''