---

### **Basics of Python & Libraries**
1. **What is data analysis?**  
   Data analysis involves inspecting, cleaning, and modeling data to discover useful information and support decision-making.

2. **Which Python libraries are commonly used for data analysis?**  
   Common libraries include pandas, NumPy, Matplotlib, Seaborn, and scikit-learn.

3. **What is pandas used for in Python?**  
   Pandas is used for data manipulation and analysis, especially with tabular data in DataFrames.

4. **What is NumPy?**  
   NumPy is a library for numerical computations and supports large multi-dimensional arrays and matrices.

5. **What is a DataFrame in pandas?**  
   A DataFrame is a 2D labeled data structure similar to a table in a database or Excel sheet.

6. **What is a Series in pandas?**  
   A Series is a one-dimensional labeled array capable of holding any data type.

7. **How do you import pandas in Python?**  
   By using `import pandas as pd`.

8. **How do you read a CSV file in pandas?**  
   Use `pd.read_csv('filename.csv')`.

9. **What is the purpose of the `head()` function in pandas?**  
   It returns the first five rows of the DataFrame by default.

10. **What does the `info()` function do in pandas?**  
    It gives a summary of the DataFrame including data types and non-null counts.
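
The loading and inspection calls above can be tried together in a minimal sketch. The CSV text and column names are made up for illustration, and an in-memory buffer stands in for a real file path.

```python
import io

import pandas as pd

# Hypothetical CSV content; in practice you would pass a file path to read_csv.
csv_text = """name,age,city
Asha,28,Pune
Ben,35,Delhi
Chen,41,Mumbai
Dev,23,Pune
Eva,31,Delhi
Farah,29,Mumbai
"""

df = pd.read_csv(io.StringIO(csv_text))  # read_csv accepts a path or a file-like object

first_rows = df.head()   # first five rows by default
ages = df["age"]         # selecting one column gives a Series

print(first_rows)
df.info()                # prints dtypes and non-null counts
```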

---

### **Data Cleaning & Preprocessing**
11. **What is data cleaning?**  
    Data cleaning involves handling missing values, correcting data types, and removing duplicates.

12. **How do you check for null values in pandas?**  
    Use `df.isnull()` to identify and `df.isnull().sum()` to count null values column-wise.

13. **How do you drop null values in pandas?**  
    Use `df.dropna()` to remove rows with null values.

14. **How do you fill missing values in pandas?**  
    Use `df.fillna(value)` to replace nulls with a specified value.

15. **What is data normalization?**  
    Normalization scales data to a fixed range, usually [0, 1], to prepare it for modeling.

16. **What is the use of `astype()` in pandas?**  
    It is used to change the data type of a column.

17. **How can you remove duplicate rows in pandas?**  
    Use `df.drop_duplicates()`.

18. **What is data encoding?**  
    Encoding converts categorical variables into numerical format using techniques like one-hot encoding.

19. **What is data transformation?**  
    It involves converting data into a form better suited to analysis, for example by applying a log or square-root transform to reduce skew.

20. **What is the purpose of the `apply()` function?**  
    It applies a function along the axis of a DataFrame (rows or columns).
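
As a rough illustration of the cleaning steps in this section, the sketch below chains them on a small made-up DataFrame. The column names and values are assumptions chosen so that each step has something to do.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one duplicate row, one missing rating, numbers stored as text.
df = pd.DataFrame({
    "city": ["Pune", "Delhi", "Delhi", "Mumbai", "Pune"],
    "sales": ["10", "20", "20", "40", "30"],
    "rating": [4.0, 3.0, 3.0, np.nan, 5.0],
})

null_counts = df.isnull().sum()               # nulls per column
clean = df.drop_duplicates()                  # drops the repeated Delhi row
clean = clean.astype({"sales": int})          # text -> integer
clean["rating"] = clean["rating"].fillna(clean["rating"].mean())  # impute with the mean

# Min-max normalization to the range [0, 1].
sales = clean["sales"]
clean["sales_scaled"] = (sales - sales.min()) / (sales.max() - sales.min())

encoded = pd.get_dummies(clean, columns=["city"])   # one-hot encode the categorical column
```

Mean imputation and min-max scaling are only two of several reasonable choices; the right one depends on the data and the model.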

---

### **Data Manipulation & Aggregation**
21. **How do you select a column in pandas?**  
    Use `df['column_name']` to select a column.

22. **How do you filter rows in pandas based on a condition?**  
    Use boolean indexing like `df[df['column'] > 10]`.

23. **What is grouping in pandas?**  
    Grouping splits data into groups for aggregation using `groupby()`.

24. **What does the `describe()` function do?**  
    It provides a statistical summary (count, mean, std, min, quartiles, max) of the numerical columns.

25. **How do you sort a DataFrame by a column?**  
    Use `df.sort_values('column_name')`.

26. **How do you rename a column in pandas?**  
    Use `df.rename(columns={'old':'new'}, inplace=True)`.

27. **What is indexing in pandas?**  
    Indexing selects specific rows and columns in a DataFrame.

28. **What is the difference between `loc[]` and `iloc[]`?**  
    `loc[]` is label-based; `iloc[]` is integer-position-based indexing.

29. **How do you merge two DataFrames in pandas?**  
    Use `pd.merge(df1, df2, on='column')`.

30. **How do you concatenate DataFrames?**  
    Use `pd.concat([df1, df2])`.
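
The selection, grouping, and combining calls above are easier to compare side by side. The following sketch uses two small hypothetical tables; the names `orders` and `customers` are assumptions.

```python
import pandas as pd

# Hypothetical tables used only for illustration.
orders = pd.DataFrame(
    {"customer_id": [1, 2, 1, 3], "amount": [50, 20, 30, 70]},
    index=["a", "b", "c", "d"],
)
customers = pd.DataFrame({"customer_id": [1, 2, 3], "name": ["Asha", "Ben", "Chen"]})

big_orders = orders[orders["amount"] > 25]    # boolean indexing
by_label = orders.loc["b", "amount"]          # label-based: row "b"
by_position = orders.iloc[1, 1]               # position-based: second row, second column

totals = orders.groupby("customer_id")["amount"].sum()   # split, then aggregate
merged = pd.merge(orders, customers, on="customer_id")   # inner join by default
stacked = pd.concat([orders, orders])                    # stacks rows, keeps index labels
sorted_orders = orders.sort_values("amount", ascending=False)
```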

---

### **Data Visualization**
31. **Why is data visualization important?**  
    It helps in understanding patterns, trends, and insights visually.

32. **Which libraries are used for data visualization in Python?**  
    Common ones include Matplotlib and Seaborn.

33. **How do you plot a line chart using Matplotlib?**  
    Use `plt.plot(x, y)` and `plt.show()`.

34. **How do you create a bar chart in Seaborn?**  
    Use `sns.barplot(x='col1', y='col2', data=df)`.

35. **What is a histogram used for?**  
    It shows the distribution of a numeric variable.

36. **How do you show a plot in Python?**  
    Use `plt.show()` to display the plot.

37. **What is a scatter plot used for?**  
    It shows the relationship between two numerical variables.

38. **What is a heatmap?**  
    A heatmap shows correlations or values as color intensities.

39. **What is the use of `pairplot()` in Seaborn?**  
    It plots pairwise relationships in a dataset.

40. **How do you set the size of a plot in Matplotlib?**  
    Use `plt.figure(figsize=(width, height))`.
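
A minimal Matplotlib sketch of the plotting calls above follows. It assumes a non-interactive backend (`Agg`) so it can run without a display, and it saves the figure to memory; in an interactive session you would call `plt.show()` instead.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [v ** 2 for v in x]

fig, ax = plt.subplots(figsize=(6, 4))   # width and height in inches
ax.plot(x, y, marker="o", label="y = x^2")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("Line chart")
ax.legend()

# With an interactive backend you would call plt.show(); here the figure is saved instead.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)
```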

---

### **Statistical Analysis & ML Basics**
41. **What is correlation in data analysis?**  
    It measures the strength and direction of a linear relationship between variables.

42. **What does a correlation matrix show?**  
    It shows the pairwise correlation coefficients between variables.

43. **What is a regression analysis?**  
    It estimates the relationship between a dependent and one or more independent variables.

44. **What is linear regression?**  
    A method that models a dependent variable as a linear combination of one or more independent variables (a straight line when there is a single predictor).

45. **What is logistic regression used for?**  
    It is used for classification, most commonly binary (two-class) problems, by modeling the probability that an observation belongs to a class.

46. **Which library is used for ML in Python?**  
    `scikit-learn` is commonly used for machine learning tasks.

47. **What is data splitting in ML?**  
    Dividing data into training and testing sets, typically with `train_test_split` from `sklearn.model_selection`, so the model is evaluated on data it has not seen.

48. **How do you evaluate a regression model?**  
    Common metrics include R² score, MAE, MSE.

49. **What is overfitting in ML?**  
    Overfitting is when a model learns the training data too well and performs poorly on new data.

50. **What is feature scaling?**  
    It rescales features to a comparable range (for example by min-max normalization or standardization) so that features with large magnitudes do not dominate the model.
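
The regression metrics mentioned above can be computed by hand to see what they measure. This sketch uses NumPy and made-up values; scikit-learn provides equivalent functions in `sklearn.metrics` (`mean_absolute_error`, `mean_squared_error`, `r2_score`).

```python
import numpy as np

# Hypothetical true values and model predictions.
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.5, 5.0, 7.5, 9.0])

errors = y_true - y_pred
mae = np.mean(np.abs(errors))                    # mean absolute error
mse = np.mean(errors ** 2)                       # mean squared error
ss_res = np.sum(errors ** 2)                     # residual sum of squares
ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
r2 = 1 - ss_res / ss_tot                         # share of variance explained

print(mae, mse, r2)
```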

---

### General Python and Data Analysis
1. **What is the purpose of the `eval()` function in Python as used in your programs?**  
   The `eval()` function evaluates a string as a Python expression, often used to convert user input into a data structure like a list.  
   In your programs, it’s used to parse input arrays, but it should be used cautiously due to security risks.

2. **How does NumPy improve data analysis compared to Python lists?**  
   NumPy provides efficient array operations, vectorized computations, and memory optimization for numerical data.  
   It’s used in your programs for array manipulation, statistical calculations, and plotting.

3. **What is the role of Pandas in data analysis?**  
   Pandas offers DataFrame and Series for structured data manipulation, cleaning, and analysis.  
   Your programs use Pandas for loading datasets, handling missing values, and grouping data.

4. **Why is Matplotlib used in your data visualization programs?**  
   Matplotlib creates customizable plots like scatter, bar, and line graphs for data visualization.  
   It’s used in your programs to visualize trends, such as runtime vs. popularity or regression models.

5. **What is Seaborn, and how does it differ from Matplotlib?**  
   Seaborn is a high-level library built on Matplotlib, offering aesthetically pleasing statistical plots.  
   Your Q11 program uses Seaborn for regression, box, and violin plots with the "tips" dataset.
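
On the `eval()` caution above: when the input is expected to be a plain literal such as a list, `ast.literal_eval()` from the standard library is a commonly suggested safer alternative. The sketch below is not taken from the programs discussed here; the input string is a stand-in for what `input()` would return.

```python
import ast

user_text = "[4, 8, 15, 16]"   # stands in for a string returned by input()

numbers = ast.literal_eval(user_text)  # parses literals only: lists, tuples, dicts, numbers, strings

# literal_eval refuses anything that is not a plain literal, such as a function call.
try:
    ast.literal_eval("__import__('os').getcwd()")
    rejected = False
except (ValueError, SyntaxError):
    rejected = True

print(numbers, rejected)
```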

---

### Linear Search and Array Manipulation (Q1, Q5, Q7)
6. **What is the time complexity of the linear search algorithm in Q1?**  
   The time complexity of linear search is O(n), where n is the array length.  
   It checks each element sequentially until the key is found or the array ends.

7. **How does the linear search program handle cases where the element is not found?**  
   The program prints “Element not found” for each position until the loop completes without finding the key.  
   A found flag (or a `for...else` clause) would let it print the message once, after the loop has confirmed the element’s absence.

8. **What does `np.random.randint()` do in Q7?**  
   It generates a random integer array within a specified range, used to create an m×n matrix.  
   In Q7, it populates the matrix with random values for visualization.

9. **What is the significance of `np.concat()` in Q5?**  
   `np.concat()` (an alias of `np.concatenate()` added in NumPy 2.0) joins multiple arrays along an existing axis, here to merge two 2D arrays.  
   In Q5, it creates a single array for further reshaping and manipulation.

10. **How does `np.split()` work in Q5?**  
   `np.split()` divides an array into equal-sized sub-arrays along a specified axis when given an integer, and raises a `ValueError` if the array cannot be divided evenly.  
   In Q5, it splits a reshaped array into three parts for demonstration.
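
A possible shape for the linear search discussed above is sketched here. It is not the Q1 program itself; the function name and the `-1` return convention are assumptions. Returning from the loop on the first match means the "not found" case is reported only once.

```python
def linear_search(values, key):
    """Return the index of the first match, or -1 if key is absent."""
    for index, value in enumerate(values):
        if value == key:
            return index   # stop at the first match
    return -1              # reached only after every element was checked


data = [7, 3, 9, 3, 5]
position = linear_search(data, 9)
missing = linear_search(data, 4)
print("Element not found" if missing == -1 else f"Found at index {missing}")
```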

---

### Insertion in Sorted List (Q2)
11. **What is the purpose of the insert function in Q2?**  
   The insert function adds an element into a sorted list while maintaining the sorted order.  
   It finds the correct index and concatenates slices of the list with the new element.

12. **What happens if the key in Q2 is larger than all elements in the list?**  
   The key is appended to the end of the list, as the index defaults to the list’s length.  
   This ensures the list remains sorted after insertion.

13. **Why is `sorted(set())` used in Q2’s input processing?**  
   It converts the input to a sorted list with unique elements, removing duplicates.  
   This ensures the list is ready for insertion without redundant values.

14. **What is the time complexity of the insertion algorithm in Q2?**  
   The time complexity is O(n) due to the linear search for the insertion point and list slicing.  
   List slicing in Python creates new lists, adding to the overhead.

15. **How could you optimize the insertion in Q2?**  
   Use binary search to find the insertion point, reducing the search time to O(log n).  
   However, list slicing would still contribute O(n) complexity.
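
The binary-search optimization mentioned above could look like the following sketch, which uses the standard-library `bisect` module. The function name `insert_sorted` is hypothetical and this is not the Q2 code; it keeps the slice-and-concatenate approach, so the overall cost stays O(n).

```python
import bisect


def insert_sorted(sorted_values, key):
    """Return a new sorted list with key inserted; the input list is left unchanged."""
    index = bisect.bisect_left(sorted_values, key)   # O(log n) search for the slot
    return sorted_values[:index] + [key] + sorted_values[index:]   # O(n) copy


values = sorted(set([5, 1, 9, 5, 3]))   # deduplicate, then sort: [1, 3, 5, 9]
with_middle = insert_sorted(values, 4)
with_largest = insert_sorted(values, 12)   # lands at the end of the list
```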

---

### Object-Oriented Programming (Q3)
16. **What is encapsulation as demonstrated in Q3?**  
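   Encapsulation bundles data with the methods that use it and limits direct access to attributes. Python signals this by naming convention rather than enforcement: a single underscore (e.g., `_c`) marks an attribute as protected, and a double underscore triggers name mangling.  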
   In Q3, `_c` in the Base class is accessed by the Derived class, showing controlled access.

17. **How is inheritance implemented in Q3?**  
   The Derived class inherits from the Base class, accessing its attributes via `super().__init__()`.  
   This allows Derived to reuse and extend Base’s functionality.

18. **What is operator overloading in Q3’s class B?**  
   Operator overloading defines custom behavior for operators, like `+`, using `__add__`.  
   In Q3, `B.__add__` adds an instance’s `_y` to another’s `_x`.

19. **Why is `_x` defined as a protected attribute in class A?**  
   The underscore prefix indicates `_x` is intended for internal use, discouraging direct access.  
   In Q3, it’s still accessible in class B, showing Python’s naming convention for protection.

20. **What does `super().__init__()` do in Q3?**  
   It calls the parent class’s `__init__` method to initialize inherited attributes.  
   In Q3, Derived uses it to set up Base’s `a` and `_c`.
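
The ideas in this section can be seen together in a small sketch. The `Base`/`Derived` attribute names follow the answers above, but the class bodies are a guess at the general shape, not the Q3 code, and `Point` is a hypothetical class added only to show `__add__`.

```python
class Base:
    def __init__(self, a, c):
        self.a = a     # public attribute
        self._c = c    # "protected" by convention only; Python does not enforce it


class Derived(Base):
    def __init__(self, a, c, d):
        super().__init__(a, c)   # let Base initialise a and _c
        self.d = d

    def total(self):
        return self.a + self._c + self.d   # subclasses may use the protected attribute


class Point:
    """Hypothetical class showing operator overloading with __add__."""

    def __init__(self, x, y):
        self._x = x
        self._y = y

    def __add__(self, other):
        return Point(self._x + other._x, self._y + other._y)

    def __repr__(self):
        return f"Point({self._x}, {self._y})"


obj = Derived(1, 2, 3)
combined = Point(1, 2) + Point(10, 20)   # calls Point.__add__
```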

---

### Data Cleaning and Manipulation (Q4, IMDB Analysis)
21. **Why is `df.drop_duplicates()` used in Q4?**  
   It removes duplicate rows to ensure data integrity and avoid skewed analysis.  
   In Q4, it eliminates repeated entries like Hank and Ivy.

22. **How does `df.fillna()` handle missing values in Q4?**  
   It replaces NaN values with specified values, like the mean for numeric columns.  
   In Q4, Age and Salary NaNs are filled with their respective means.

23. **What is the purpose of `np.mean()` in Q4?**  
   It calculates the average of a numeric column, used to fill missing values.  
   In Q4, it computes mean Age and Salary for imputation.

24. **Why is `df['Standardized_Age']` created in Q4?**  
   It standardizes Age by subtracting the mean and dividing by the standard deviation.  
   The resulting z-score has mean 0 and standard deviation 1, which makes the column comparable with features measured on other scales.

25. **How does the IMDB analysis handle missing budget values?**  
   Zero budget values are replaced with NaN to distinguish invalid entries.  
   This ensures accurate statistical analysis, as seen in the budget-popularity correlation.
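
The cleaning steps described in this section might be combined as in the sketch below. The records are invented for illustration and are not the Q4 or IMDB data; only the column names `Age`, `Salary`, and `Standardized_Age` come from the answers above.

```python
import numpy as np
import pandas as pd

# Hypothetical records; the values are illustrative only.
df = pd.DataFrame({
    "Name": ["Hank", "Ivy", "Hank", "Jack"],
    "Age": [30.0, np.nan, 30.0, 50.0],
    "Salary": [4000.0, 6000.0, 4000.0, np.nan],
    "Budget": [0, 120, 0, 300],
})

df = df.drop_duplicates().reset_index(drop=True)        # the second Hank row is dropped

df["Age"] = df["Age"].fillna(df["Age"].mean())          # impute missing Age with the mean
df["Salary"] = df["Salary"].fillna(df["Salary"].mean())

# z-score; note that pandas .std() uses the sample formula (ddof=1) by default.
df["Standardized_Age"] = (df["Age"] - df["Age"].mean()) / df["Age"].std()

df["Budget"] = df["Budget"].replace(0, np.nan)          # treat a zero budget as missing
```

Because aggregations such as `mean()` skip NaN by default, marking zero budgets as missing keeps them out of later averages.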

---

### Data Visualization (Q6, Q11, IMDB Analysis)
26. **What is the advantage of using a scatter plot in Q6?**  
   Scatter plots visualize relationships between two continuous variables, like x vs. y.  
   In Q6, it shows the distribution of input arrays clearly.

27. **Why is `plt.figure(figsize=(6,4))` used in Q6?**  
   It sets the plot size for better readability and aesthetics.  
   In Q6, it ensures all graphs (line, scatter, bar, pie) are consistently sized.

28. **What does `sns.regplot()` do in Q11?**  
   It creates a scatter plot with a regression line to show the relationship between variables.  
   In Q11, it visualizes the correlation between total_bill and tip.

29. **How does the IMDB analysis use heatmaps for genre trends?**  
   Heatmaps show budget and revenue trends across genres and years, with color intensity indicating values.  
   They reveal that action/adventure genres gained traction post-2000.

30. **What is the purpose of `sns.pairplot()` in Q11?**  
   It generates pairwise scatter plots for multiple variables, colored by a categorical variable (e.g., sex).  
   In Q11, it explores relationships in the "tips" dataset across all numeric columns.

---

### Statistical Modeling (Q8, Q9)
31. **What does `LinearRegression().fit()` do in Q8?**  
   It trains a linear regression model by fitting the input features (x) to the target (y).  
   In Q8, it learns coefficients for predicting y from x.

32. **What is the significance of `model.coef_` in Q8?**  
   It represents the coefficients (slopes) of the linear regression model for each feature.  
   In Q8, it shows the weight of each input dimension.

33. **How does `LogisticRegression()` differ from `LinearRegression()` in Q9?**  
   Logistic regression predicts categorical outcomes (e.g., 0 or 1), while linear regression predicts continuous values.  
   In Q9, it classifies based on input features x.

34. **What does `model.predict()` return in Q9?**  
   It returns the predicted class labels for the input data based on the trained logistic model.  
   In Q9, it predicts whether new data belongs to class 0 or 1.

35. **Why is the intercept important in regression models (Q8, Q9)?**  
   The intercept is the predicted value when all features are zero (for logistic regression, the log-odds at that point), setting the baseline.  
   In Q8 and Q9, it adjusts the regression line or decision boundary.
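
To make `fit()`, `coef_`, `intercept_`, and `predict()` concrete, here is a small sketch with scikit-learn. The data is synthetic (an exact line for the regression, six labelled points for the classifier) and is not the Q8 or Q9 input.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

# Hypothetical data: y = 2x + 1 exactly, so the fitted line is easy to check.
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])   # shape (n_samples, n_features)
y_continuous = 2 * X.ravel() + 1

linear = LinearRegression().fit(X, y_continuous)
slope = linear.coef_[0]         # one coefficient per feature
intercept = linear.intercept_   # prediction when the feature is 0

# Binary labels: class 1 for the larger x values.
y_class = np.array([0, 0, 0, 1, 1, 1])
logistic = LogisticRegression().fit(X, y_class)
labels = logistic.predict(np.array([[0.5], [4.5]]))         # predicted class labels
probabilities = logistic.predict_proba(np.array([[4.5]]))   # columns follow logistic.classes_
```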

---

### Time Series and Advanced Analysis (Q10, IMDB Analysis)
36. **What is `pd.date_range()` in Q10?**  
   It generates a sequence of dates at a specified frequency, used as a time series index.  
   In Q10, it creates daily dates for the DataFrame.

37. **How does `df.resample()` work in Q10?**  
   It aggregates time series data to a different frequency, like weekly averages.  
   In Q10, it resamples daily data to weekly means.

38. **What does the IMDB analysis reveal about runtime and popularity?**  
   Films with runtimes of 100–200 minutes have higher popularity, while those over 200 minutes see a decline.  
   This is shown via bar charts and scatter plots.

39. **How is profit calculated in the IMDB analysis?**  
   Profit is calculated as revenue minus budget, using adjusted values for consistency.  
   It’s used to analyze the correlation with popularity.

40. **What does `df.nlargest()` do in the IMDB analysis?**  
   It returns the top n rows based on a specified column, like revenue.  
   It’s used to identify the top 10 revenue-generating movies.
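
A short sketch of `pd.date_range()`, `resample()`, and `nlargest()` on synthetic data follows. The start date and values are assumptions; 2024-01-01 is a Monday, so the 14 days fall into exactly two weekly bins.

```python
import numpy as np
import pandas as pd

dates = pd.date_range(start="2024-01-01", periods=14, freq="D")   # 14 consecutive days
df = pd.DataFrame({"value": np.arange(14)}, index=dates)

weekly_mean = df.resample("W").mean()   # weekly bins ending on Sunday by default
top_three = df.nlargest(3, "value")     # rows with the three largest values

print(weekly_mean)
```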

---

### IMDB-Specific Analysis
41. **Why are budget and revenue adjusted for inflation in the IMDB dataset?**  
   Adjustment to 2010 dollars ensures fair comparison across years, accounting for inflation.  
   This improves the accuracy of budget-popularity and revenue analyses.

42. **What does the scatter plot of budget vs. popularity show in the IMDB analysis?**  
   It shows a weak linear correlation, indicating budget alone doesn’t drive popularity.  
   High-budget films have 50% higher average popularity than low-budget ones.

43. **How is the median used in the IMDB budget-popularity analysis?**  
   The median budget splits movies into low and high-budget groups for comparison.  
   This reveals high-budget films have higher average popularity.

44. **Why is the first genre extracted in the IMDB genre analysis?**  
   Extracting the first genre simplifies analysis by focusing on the primary genre.  
   It’s used to study genre trends over time via heatmaps.

45. **What do the heatmaps in the IMDB analysis indicate?**  
   They show budget and revenue trends, highlighting action/adventure’s rise post-2000.  
   Color intensity reflects higher values for specific genres and years.

---

### Miscellaneous
46. **What is the role of `np.linspace()` in Q5?**  
   It generates evenly spaced numbers over a specified range, used for plotting smooth curves.  
   In Q5, it creates x-values for the sine wave plot.

47. **Why is `cmap` used in Q7’s `plt.imshow()`?**  
   `cmap` specifies the color map for visualizing the matrix, like "viridis" for gradient colors.  
   It enhances the matrix’s visual interpretability.

48. **What is the significance of `vote_count` in the IMDB dataset?**  
   Vote count reflects audience engagement, used to assess popularity and film impact.  
   It’s analyzed alongside runtime and popularity in the IMDB study.

49. **How does the IMDB analysis handle multi-value columns like genres?**  
   Genres are split by the pipe (|) separator, and the first genre is often used for simplicity.  
   A custom function counts genre occurrences for trend analysis.

50. **What are the limitations of the IMDB analysis?**  
   Missing budget/revenue data and lack of currency standardization for international films skew results.  
   IMDb’s proprietary popularity metric also lacks transparency.
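
For the pipe-separated genres column described above, one way to extract the first genre and count occurrences is sketched below. It uses `str.split` with `explode` and `value_counts` rather than the custom counting function the analysis mentions, and the rows are made up.

```python
import pandas as pd

# Hypothetical rows in the pipe-separated style described above.
movies = pd.DataFrame({
    "title": ["A", "B", "C"],
    "genres": ["Action|Adventure", "Drama", "Action|Drama|Thriller"],
})

movies["primary_genre"] = movies["genres"].str.split("|").str[0]   # first listed genre

# One row per (movie, genre) pair, then count how often each genre appears.
genre_counts = movies["genres"].str.split("|").explode().value_counts()
```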

---

### **General Concepts**

1. **Q:** What is data analysis?  
   **A:** Data analysis is the process of inspecting, cleaning, transforming, and modeling data to discover useful information and insights.

2. **Q:** Why is Python widely used for data analysis?  
   **A:** Python is simple, has powerful libraries like pandas and NumPy, and supports data visualization and machine learning.

3. **Q:** Name three popular Python libraries used in data analysis.  
   **A:** Pandas, NumPy, and Matplotlib are commonly used for data handling, computation, and visualization.

4. **Q:** What is a DataFrame in pandas?  
   **A:** A DataFrame is a 2D labeled data structure, similar to an Excel sheet or SQL table.

5. **Q:** What is a Series in pandas?  
   **A:** A Series is a one-dimensional labeled array that can hold any data type.

---

### **NumPy**

6. **Q:** What is NumPy used for?  
   **A:** NumPy is used for numerical computing with support for multi-dimensional arrays and mathematical operations.

7. **Q:** How do you create a NumPy array?  
   **A:** Using `np.array([1, 2, 3])`, where `np` is the alias for the NumPy module.

8. **Q:** What is broadcasting in NumPy?  
   **A:** Broadcasting lets NumPy apply element-wise operations to arrays of different but compatible shapes by virtually repeating the smaller array along dimensions of size 1, without copying data.

9. **Q:** What does `np.zeros((2,3))` return?  
   **A:** It returns a 2×3 array filled with zeros.

10. **Q:** How can you reshape a NumPy array?  
   **A:** Using the `.reshape()` method like `array.reshape(2, 3)`.
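
The array creation, reshaping, and broadcasting answers above can be checked with a few lines. The variable names are arbitrary.

```python
import numpy as np

a = np.array([1, 2, 3])             # 1-D array, shape (3,)
zeros = np.zeros((2, 3))            # 2x3 array of 0.0
grid = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]

# Broadcasting: the shape-(3,) array is treated as if repeated for each row of grid.
shifted = grid + a                  # [[1, 3, 5], [4, 6, 8]]

column = np.array([[10], [20]])     # shape (2, 1) is stretched across the columns
offset = grid + column              # [[10, 11, 12], [23, 24, 25]]
```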

---

### **Pandas**

11. **Q:** How do you read a CSV file using pandas?  
   **A:** Using `pd.read_csv('filename.csv')`.

12. **Q:** How do you display the first few rows of a DataFrame?  
   **A:** Use `df.head()` to view the top 5 rows by default.

13. **Q:** How do you get basic statistics of a DataFrame?  
   **A:** Use `df.describe()` to get count, mean, std, min, max, etc.

14. **Q:** How do you check for missing values in a DataFrame?  
   **A:** Use `df.isnull()` to find NaNs and `df.isnull().sum()` to count them.

15. **Q:** How do you drop rows with missing values?  
   **A:** Use `df.dropna()` to remove rows containing NaN.

---

### **Data Cleaning & Manipulation**

16. **Q:** What is data cleaning?  
   **A:** Data cleaning involves removing errors, handling missing values, and standardizing formats.

17. **Q:** How do you fill missing values with the mean in pandas?  
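   **A:** Use `df.fillna(df.mean(numeric_only=True))`; without `numeric_only=True`, pandas 2.0 and later raise an error if the DataFrame has non-numeric columns.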

18. **Q:** How do you rename columns in pandas?  
   **A:** Use `df.rename(columns={'old': 'new'}, inplace=True)`.

19. **Q:** How do you sort a DataFrame by a column?  
   **A:** Use `df.sort_values('column_name')`.

20. **Q:** What is `groupby()` in pandas?  
   **A:** It groups rows based on column values and applies aggregation functions like sum, mean, etc.

---

### **Data Visualization**

21. **Q:** What is data visualization?  
   **A:** It is the graphical representation of data to understand trends and patterns.

22. **Q:** Name some popular Python libraries for visualization.  
   **A:** Matplotlib, Seaborn, and Plotly.

23. **Q:** How do you plot a line chart in Matplotlib?  
   **A:** Use `plt.plot(x, y)` and `plt.show()` to display it.

24. **Q:** How do you create a bar chart in Matplotlib?  
   **A:** Use `plt.bar(x, y)`.

25. **Q:** What does `plt.scatter(x, y)` do?  
   **A:** It creates a scatter plot to show relationships between two variables.

---

### **Statistical Analysis**

26. **Q:** What is correlation in data analysis?  
   **A:** Correlation measures the strength and direction of the linear relationship between two variables; the coefficient ranges from -1 to +1.

27. **Q:** How do you calculate correlation in pandas?  
   **A:** Use `df.corr()`; in pandas 2.0 and later, pass `numeric_only=True` if the DataFrame contains non-numeric columns.

28. **Q:** What is the use of `value_counts()` in pandas?  
   **A:** It counts how many times each unique value occurs in a Series or DataFrame column, sorted by frequency.

29. **Q:** What is a pivot table?  
   **A:** A pivot table summarizes data by grouping and aggregating based on given columns.

30. **Q:** How do you apply a function to each row in a DataFrame?  
   **A:** Use `df.apply(func, axis=1)`.
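
The four calls in this section are shown together in the sketch below, on a hypothetical sales table. It assumes pandas 1.5 or later, where `corr()` accepts `numeric_only`.

```python
import pandas as pd

# Hypothetical sales records.
df = pd.DataFrame({
    "region": ["North", "South", "North", "South"],
    "product": ["pen", "pen", "ink", "ink"],
    "units": [10, 20, 30, 40],
    "price": [1.0, 1.0, 2.0, 2.0],
})

correlation = df.corr(numeric_only=True)      # pairwise Pearson correlation of numeric columns
region_counts = df["region"].value_counts()   # how often each unique value occurs
pivot = df.pivot_table(index="region", columns="product", values="units", aggfunc="sum")
df["revenue"] = df.apply(lambda row: row["units"] * row["price"], axis=1)  # one call per row
```

For simple arithmetic like this, the vectorized form `df["units"] * df["price"]` is usually faster than `apply`; the row-wise version is shown only to illustrate `axis=1`.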

---

### **Matplotlib & Seaborn**

31. **Q:** What is Matplotlib?  
   **A:** A plotting library for creating static, animated, and interactive visualizations in Python, mainly 2D with basic 3D support.

32. **Q:** How do you label axes in Matplotlib?  
   **A:** Use `plt.xlabel('X')` and `plt.ylabel('Y')`.

33. **Q:** How do you add a title to a plot?  
   **A:** Use `plt.title('Title Here')`.

34. **Q:** What is Seaborn used for?  
   **A:** Seaborn simplifies statistical plotting and works well with pandas.

35. **Q:** How do you create a heatmap using Seaborn?  
   **A:** Use `sns.heatmap(data, annot=True)`.

---

### **Basic Machine Learning Concepts (if asked)**

36. **Q:** What is supervised learning?  
   **A:** It's a type of machine learning where the model learns from labeled data.

37. **Q:** What is linear regression?  
   **A:** It's a statistical method that models the relationship between a dependent variable and one or more independent variables by fitting a linear equation.

38. **Q:** What is logistic regression used for?  
   **A:** It's used for binary classification problems like spam detection.

39. **Q:** What is scikit-learn?  
   **A:** It's a machine learning library in Python used for classification, regression, and clustering.

40. **Q:** How do you import linear regression in sklearn?  
   **A:** `from sklearn.linear_model import LinearRegression`.

---

### **Additional Python Essentials**

41. **Q:** What is the difference between `loc[]` and `iloc[]` in pandas?  
   **A:** `loc[]` selects by label, while `iloc[]` selects by integer position.

42. **Q:** How do you concatenate DataFrames?  
   **A:** Use `pd.concat([df1, df2])`.

43. **Q:** What is the use of `merge()` in pandas?  
   **A:** It combines DataFrames using database-style joins.

44. **Q:** How can you reset the index of a DataFrame?  
   **A:** Use `df.reset_index(drop=True)`.

45. **Q:** What does `axis=0` and `axis=1` represent in pandas?  
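   **A:** `axis=0` runs down the rows (the index), so a reduction like `df.sum(axis=0)` returns one value per column; `axis=1` runs across the columns, returning one value per row.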
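
Since the `axis` argument is a frequent source of confusion, a small sketch may help. The DataFrame is hypothetical; note how `axis` names the dimension that a reduction collapses, while in `drop` it names the dimension the label belongs to.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]}, index=["x", "y", "z"])

column_sums = df.sum(axis=0)   # collapses the rows: one value per column
row_sums = df.sum(axis=1)      # collapses the columns: one value per row

without_row = df.drop("y", axis=0)      # drop a row by label
without_column = df.drop("b", axis=1)   # drop a column by label

filtered = df[df["a"] > 1].reset_index(drop=True)   # renumber rows 0..n-1, discard old labels
```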

---

### **Miscellaneous**

46. **Q:** What is exploratory data analysis (EDA)?  
   **A:** EDA is the initial step to analyze and summarize the main characteristics of data.

47. **Q:** Why is data preprocessing important?  
   **A:** It prepares raw data for analysis by removing noise and inconsistencies.

48. **Q:** What is outlier detection?  
   **A:** It is the process of identifying values that deviate significantly from the dataset.

49. **Q:** How do you find the shape of a DataFrame?  
   **A:** Use `df.shape`, which returns a tuple of (rows, columns).

50. **Q:** What does `df.info()` show?  
   **A:** It displays the structure, data types, and non-null values in the DataFrame.

---

### **Module 1: Python Basics**

1. **Q:** What is an interpreter in Python?  
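   **A:** An interpreter runs Python code without a separate build step: CPython compiles the source to bytecode and then executes it statement by statement, which shortens the edit-run cycle and makes debugging easier.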

2. **Q:** What are identifiers in Python?  
   **A:** Identifiers are the names used for variables, functions, classes, etc., defined by the user.

3. **Q:** What are keywords in Python?  
   **A:** Keywords are reserved words like `if`, `for`, `return` that have special meaning in Python.

4. **Q:** What are expressions and statements in Python?  
   **A:** Expressions evaluate to a value (e.g., `2 + 3` or a function call); statements are instructions that perform an action, such as an assignment, `if`, or `for`.

5. **Q:** How do you declare variables in Python?  
   **A:** You assign a value to a name using `=`, e.g., `x = 10`.

6. **Q:** What are the basic data types in Python?  
   **A:** `int`, `float`, `str`, `bool`, `list`, `tuple`, `dict`, `set`.

7. **Q:** Explain operator precedence in Python.  
   **A:** It defines the order in which operations are performed, e.g., multiplication before addition.

8. **Q:** What is indentation in Python?  
   **A:** Indentation defines code blocks; incorrect indentation raises an `IndentationError`, which is a kind of syntax error.

9. **Q:** How do you take input in Python?  
   **A:** Using the `input()` function which returns user input as a string.

10. **Q:** How do you print output in Python?  
   **A:** Use the `print()` function to display text or values.

11. **Q:** What is type conversion?  
   **A:** Converting one data type to another, e.g., `int("5")` converts string to integer.

12. **Q:** What does `type()` do in Python?  
   **A:** It returns the type of the variable passed to it, e.g., `type(10)` returns `<class 'int'>`.

13. **Q:** What does the `is` operator do?  
   **A:** It checks whether two variables refer to the same object in memory.

14. **Q:** How does the `if` statement work in Python?  
   **A:** It executes a block of code if the condition is true.

15. **Q:** What is the difference between `if`, `elif`, and `else`?  
   **A:** `if` checks the first condition, `elif` checks additional ones, and `else` runs when all fail.

16. **Q:** What is a nested `if` statement?  
   **A:** An `if` statement inside another `if` statement.

17. **Q:** What is the use of `while` loop in Python?  
   **A:** It runs a block repeatedly as long as a condition is true.

18. **Q:** When do you use a `for` loop?  
   **A:** When you need to iterate over a sequence like a list or string.

19. **Q:** What do `break` and `continue` do?  
   **A:** `break` exits the loop; `continue` skips the current iteration.

20. **Q:** What are built-in modules in Python?  
   **A:** Predefined modules like `math`, `sys`, `os` for various tasks.
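
A few of the answers above (`is` versus `==`, type conversion, `break` and `continue`) are easiest to remember after running them. The following sketch uses arbitrary values.

```python
a = [1, 2, 3]
b = [1, 2, 3]
c = a

same_value = a == b    # True: equal contents
same_object = a is b   # False: two separate list objects
alias = a is c         # True: c is another name for the same list

count = int("5") + 2   # type conversion from str to int
kind = type(count)     # <class 'int'>

evens = []
for n in range(10):
    if n % 2 == 1:
        continue       # skip odd numbers
    if n > 6:
        break          # leave the loop entirely
    evens.append(n)

print(same_value, same_object, alias, count, kind, evens)
```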

---

### **Module 2: Collections & OOP**

21. **Q:** How do you define a string in Python?  
   **A:** Using single, double, or triple quotes like `'Hello'` or `"Hello"`.

22. **Q:** What is string slicing?  
   **A:** Accessing a substring using `string[start:end]`, where `start` is included and `end` is excluded.

23. **Q:** What is the difference between lists and tuples?  
   **A:** Lists are mutable; tuples are immutable.

24. **Q:** How do you append an item to a list?  
   **A:** Use `list.append(item)`.

25. **Q:** What is a set in Python?  
   **A:** An unordered collection of unique items.

26. **Q:** How is a dictionary defined?  
   **A:** As key-value pairs, e.g., `{"name": "Alex", "age": 25}`.

27. **Q:** What are some list methods?  
   **A:** `append()`, `remove()`, `sort()`, `reverse()`.

28. **Q:** How do you read from a file in Python?  
   **A:** Use `open("filename.txt", "r")` with `read()` or `readlines()`, ideally inside a `with` block so the file is closed automatically.

29. **Q:** What is a constructor in Python?  
   **A:** The `__init__` method used to initialize class attributes.

30. **Q:** What is inheritance in Python?  
   **A:** A class can inherit attributes and methods from another class.

31. **Q:** What is method overloading?  
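   **A:** Defining multiple methods with the same name but different arguments. Python does not support this directly: a later definition with the same name replaces the earlier one, so it is emulated with default parameters, `*args`, or `functools.singledispatch`.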

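32. **Q:** What is `self` in Python?  
   **A:** It refers to the current instance of the class. `self` is a naming convention for the first parameter of instance methods, not a reserved keyword.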

33. **Q:** What is scope of a variable?  
   **A:** The region where the variable is accessible, e.g., local or global.

34. **Q:** What are default parameters in functions?  
   **A:** Parameters that take default values if no arguments are passed.

35. **Q:** What is `*args` and `**kwargs`?  
   **A:** `*args` collects extra positional arguments into a tuple; `**kwargs` collects extra keyword arguments into a dict.
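
Python has no built-in method overloading, so the usual substitutes are default parameters or `functools.singledispatch`. The sketch below shows both, plus `*args` and `**kwargs`; all function names are hypothetical.

```python
from functools import singledispatch


def area(length, width=None):
    """One function covering two call shapes through a default parameter."""
    if width is None:
        return length * length   # square
    return length * width        # rectangle


@singledispatch
def describe(value):
    return f"object: {value}"    # fallback for unregistered types


@describe.register
def _(value: int):
    return f"integer: {value}"


@describe.register
def _(value: str):
    return f"text: {value}"


def report(*args, **kwargs):
    """Collect extra positional arguments in a tuple and keyword arguments in a dict."""
    return len(args), sorted(kwargs)


print(area(3), area(3, 4), describe(5), describe("hi"), report(1, 2, x=3, y=4))
```

`singledispatch` chooses the implementation from the type of the first argument only, which is narrower than overloading in languages such as Java or C++.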

---

### **Module 3: Data Preprocessing & Wrangling**

36. **Q:** What is data wrangling?  
   **A:** The process of cleaning and transforming raw data into a usable format.

37. **Q:** How do you read a CSV file in Python?  
   **A:** Using `pandas.read_csv('file.csv')`.

38. **Q:** What is data normalization?  
   **A:** Scaling data into a standard range like 0 to 1.

39. **Q:** What does `dropna()` do in pandas?  
   **A:** It removes rows with missing (NaN) values.

40. **Q:** How do you combine two datasets in pandas?  
   **A:** Using `concat()`, `merge()`, or `join()`.

41. **Q:** What is reshaping in pandas?  
   **A:** Changing the layout of a DataFrame using functions like `pivot()` or `melt()`.

42. **Q:** What are regular expressions?  
   **A:** Patterns used for string matching and manipulation, using the `re` module.
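
As a small illustration of regular expressions with the `re` module, the sketch below extracts email-like strings and digits from a made-up sentence. The email pattern is deliberately simple and is not a full address validator.

```python
import re

text = "Contact: asha@example.com, ben@example.org; phone 555-0101"

emails = re.findall(r"[\w.+-]+@[\w-]+\.[\w.]+", text)   # every non-overlapping match
phone = re.search(r"(\d{3})-(\d{4})", text)             # first match, with two groups
digits = re.sub(r"\D", "", "555-0101")                  # remove every non-digit character

print(emails, phone.groups(), digits)
```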

---

### **Module 4: Web Scraping & NumPy**

43. **Q:** What is web scraping?  
   **A:** Extracting data from websites using code, often with libraries like `requests` and `BeautifulSoup`.

44. **Q:** What are CSS selectors?  
   **A:** They are patterns used to select HTML elements in web scraping.

45. **Q:** What is NumPy used for?  
   **A:** For numerical operations, arrays, and scientific computing.

46. **Q:** How do you create a NumPy array?  
   **A:** Use `np.array([1, 2, 3])`.

47. **Q:** What is array slicing in NumPy?  
   **A:** Accessing a range of values using `array[start:end]`, where `end` is excluded; for basic slices the result is a view of the original array, not a copy.

---

### **Module 5: Data Visualization**

48. **Q:** What is Matplotlib used for?  
   **A:** For plotting 2D graphs like line charts, bar graphs, scatter plots, etc.

49. **Q:** How do you plot a graph in Matplotlib?  
   **A:** Use `plt.plot(x, y)` followed by `plt.show()`.

50. **Q:** What is Seaborn?  
   **A:** A high-level data visualization library built on Matplotlib with better aesthetics.

---


## Module-1: Python Basic Concepts and Programming
1. **What is the role of the Python interpreter?**  
   The Python interpreter compiles source code to bytecode and then executes it statement by statement.  
   It supports interactive mode for testing and script mode for running programs.

2. **What are Python identifiers?**  
   Identifiers are names used to identify variables, functions, or classes, following rules like starting with a letter or underscore.  
   They must avoid reserved keywords and special characters.

3. **What are Python keywords?**  
   Keywords are reserved words like `if`, `for`, `while`, with special meanings in Python.  
   They cannot be used as identifiers or variable names.

4. **How do statements differ from expressions in Python?**  
   Statements are complete instructions (e.g., `if`, `for`), while expressions are code snippets that produce values (e.g., `2 + 3`).  
   Expressions can be part of statements but not vice versa.

5. **What is variable scope in Python?**  
   Variable scope defines where a variable is accessible, such as local (inside a function) or global (outside).  
   The `global` keyword allows modifying global variables inside functions.

6. **What are Python operators and their precedence?**  
   Operators perform operations like arithmetic (`+`, `-`), logical (`and`, `or`), etc., with precedence rules (e.g., `*` before `+`).  
   Parentheses can override default precedence for clarity.

7. **Why is indentation important in Python?**  
   Indentation defines code blocks (e.g., loops, functions) instead of braces, ensuring proper structure.  
   Inconsistent indentation raises an `IndentationError`, which is a subclass of `SyntaxError`.

8. **How does the `input()` function work in Python?**  
   The `input()` function reads user input as a string from the console.  
   It can be converted to other types using functions like `int()` or `float()`.

9. **What is the `type()` function used for?**  
   The `type()` function returns the data type of a variable or value, like `int`, `str`, or `list`.  
   It’s useful for debugging or type checking.

10. **What is the `is` operator in Python?**  
   The `is` operator checks if two variables refer to the same object in memory, unlike `==` which checks value equality.  
   It’s often used with `None` or to compare object identity.

11. **How does the `if...elif...else` statement work?**  
   It evaluates multiple conditions sequentially, executing the block of the first true condition or the `else` block if none are true.  
   It’s used for multi-way decision-making.

12. **What is the difference between `break` and `continue` in loops?**  
   `break` exits the loop entirely, while `continue` skips the current iteration and proceeds to the next.  
   Both control loop execution flow.

13. **What are default parameters in Python functions?**  
   Default parameters provide default values for function arguments, used if no value is passed.  
   They are defined in the function signature, e.g., `def func(x=10)`.

14. **What are `*args` and `**kwargs` in Python?**  
   `*args` allows a function to accept variable positional arguments, while `**kwargs` accepts variable keyword arguments.  
   They enable flexible function definitions.

15. **How are command-line arguments handled in Python?**  
   The `sys.argv` list in the `sys` module captures command-line arguments passed to a script.  
   The first element (`sys.argv[0]`) is the script name.
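
Several answers in this module are easier to remember once run. The sketch below covers default parameters with `*args` and `**kwargs`, `is` versus `==`, and `break` versus `continue`. The function `summarize` and all variable names are hypothetical, chosen only for illustration.

```python
def summarize(label, *args, precision=2, **kwargs):
    """Sum the positional numbers and format the result as text."""
    total = sum(args)              # *args arrives as a tuple
    unit = kwargs.get("unit", "")  # **kwargs arrives as a dict
    return f"{label}: {total:.{precision}f}{unit}"


default_call = summarize("Total", 1.5, 2.5)                       # precision falls back to 2
custom_call = summarize("Total", 1.5, 2.5, precision=1, unit=" kg")

# == compares values; `is` compares object identity.
first = [1, 2, 3]
second = [1, 2, 3]   # equal contents, but a separate list object
alias = first        # a second name for the same object
same_value = first == second
same_object = first is second
alias_is_first = alias is first

# continue skips one iteration; break leaves the loop entirely.
small_evens = []
for n in range(10):
    if n % 2 == 1:
        continue     # odd number: go to the next n
    if n > 6:
        break        # stop at the first even number above 6
    small_evens.append(n)

print(default_call)                             # Total: 4.00
print(custom_call)                              # Total: 4.0 kg
print(same_value, same_object, alias_is_first)  # True False True
print(small_evens)                              # [0, 2, 4, 6]
```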

## Module-2: Python Collection Objects, Classes
16. **How are strings created and stored in Python?**  
   Strings are created using quotes (`'hello'` or `"hello"`) and stored as immutable sequences of characters.  
   They support operations like slicing and concatenation.

17. **What is string slicing in Python?**  
   String slicing extracts a substring using indices, e.g., `string[start:end:step]`.  
   For example, `s = "hello"; s[1:4]` returns `"ell"`.

18. **What are common string methods in Python?**  
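   Methods like `upper()`, `lower()`, `strip()`, and `split()` manipulate strings for case conversion, trimming, or splitting.  
   Because strings are immutable, these methods return new objects (a new string, or a list of strings for `split()`) instead of changing the original.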

19. **How are lists created in Python?**  
   Lists are created using square brackets, e.g., `my_list = [1, 2, 3]`, and can store heterogeneous data.  
   They are mutable, supporting operations like append and remove.

20. **What is the difference between indexing and slicing in lists?**  
   Indexing accesses a single element by its position (e.g., `list[0]`), while slicing extracts a sublist (e.g., `list[1:3]`).  
   Both use zero-based indexing.

21. **What are built-in functions used on lists?**  
   Functions like `len()`, `max()`, `min()`, and `sum()` return the length, maximum, minimum, or sum of list elements.  
   They simplify list processing.

22. **How do sets differ from lists in Python?**  
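   Sets are unordered, mutable collections of unique elements, created with `set()` or braces around the elements (e.g., `{1, 2, 3}`), unlike lists, which are ordered and allow duplicates. Note that empty braces `{}` create a dictionary, not a set.  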
   They support operations like union and intersection.

23. **What are tuples, and why are they used?**  
   Tuples are immutable sequences created with parentheses, e.g., `(1, 2, 3)`.  
   They are used for fixed data, like function return values, due to their immutability.

24. **How do dictionaries store data in Python?**  
   Dictionaries store key-value pairs, created with `dict()` or `{}`, e.g., `{"name": "Alice", "age": 25}`.  
   They provide fast lookups using keys.

25. **What is inheritance in Python classes?**  
   Inheritance allows a class to inherit attributes and methods from a parent class using `class Child(Parent)`.  
   It promotes code reuse and extensibility.

26. **What is a constructor in Python classes?**  
   The `__init__` method serves as the constructor (strictly, an initializer, since the object already exists when it runs) and sets up the instance's attributes.  
   It’s called automatically when an object is created.

27. **How does method overloading work in Python?**  
   Python doesn’t support traditional method overloading: defining a second method with the same name replaces the first. A similar effect is achieved using default parameters or `*args`/`**kwargs`.  
   Functions can handle different argument types dynamically.

28. **How do you read a file in Python?**  
   Use the `open()` function with mode `"r"` and methods like `read()` or `readlines()` to read file content.  
   Always close the file using `close()` or use a `with` statement.

29. **How do you write to a file in Python?**  
   Use `open()` with mode `"w"` or `"a"` and `write()` to add content to a file.  
   The `with` statement ensures proper file handling and closure.

30. **What is the difference between `"w"` and `"a"` modes in file handling?**  
   `"w"` overwrites the file if it exists, while `"a"` appends to the existing content.  
   Both create a new file if it doesn’t exist.
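
The sketch below puts three topics from this module side by side: empty sets versus empty dictionaries, inheritance with `__init__`, and the `"w"` and `"a"` file modes. The class names `Animal` and `Dog` and the file name `notes.txt` are hypothetical. A temporary directory is used so nothing is left on disk.

```python
import os
import tempfile

# Empty braces give a dict; an empty set needs set().
empty_braces = {}
empty_set = set()
unique = {1, 2, 2, 3}  # the duplicate 2 is dropped


class Animal:
    def __init__(self, name):
        self.name = name  # runs automatically when an instance is created

    def speak(self):
        return f"{self.name} makes a sound"


class Dog(Animal):  # Dog inherits __init__ from Animal
    def speak(self):  # and overrides speak()
        return f"{self.name} barks"


dog = Dog("Rex")

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "notes.txt")

    with open(path, "w") as handle:  # creates the file
        handle.write("first\n")
    with open(path, "a") as handle:  # appends to the existing content
        handle.write("second\n")
    with open(path, "r") as handle:
        after_append = handle.readlines()

    with open(path, "w") as handle:  # "w" discards the previous content
        handle.write("third\n")
    with open(path, "r") as handle:
        after_overwrite = handle.read()

print(type(empty_braces).__name__, type(empty_set).__name__)  # dict set
print(dog.speak())             # Rex barks
print(after_append)            # ['first\n', 'second\n']
print(repr(after_overwrite))   # 'third\n'
```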

## Module-3: Data Pre-processing and Data Wrangling
31. **What is data pre-processing in Python?**  
   Data pre-processing involves cleaning, transforming, and organizing raw data for analysis.  
   It includes handling missing values, normalizing data, and merging datasets.

32. **How do you load a CSV file using Pandas?**  
   Use `pd.read_csv("filename.csv")` to load a CSV file into a Pandas DataFrame.  
   It supports options like specifying delimiters or encoding.

33. **How do you access an SQL database in Python?**  
   Use libraries like `sqlite3` or `SQLAlchemy` to connect to a database and execute queries.  
   For example, `conn = sqlite3.connect("database.db")` establishes a connection.

34. **What is data normalization in Python?**  
   Normalization scales numeric data to a standard range, like 0 to 1, using techniques like Min-Max scaling.  
   It’s often done with NumPy or Scikit-learn’s `MinMaxScaler`; `StandardScaler` performs standardization instead (zero mean, unit variance).

35. **How do you handle missing values in a DataFrame?**  
   Use `df.fillna(value)` to replace NaN with a specific value or `df.dropna()` to remove rows with NaN.  
   For example, fill numeric columns with their mean: `df.fillna(df.mean(numeric_only=True))`.

36. **What is data merging in Pandas?**  
   Merging combines multiple DataFrames using `pd.merge()` based on common columns or indices.  
   It supports join types like inner, outer, left, or right.

37. **How does `df.pivot_table()` work in Pandas?**  
   It reshapes data by creating a table with aggregated values based on specified columns.  
   For example, it can average values by category and year.

38. **What are regular expressions in Python?**  
   Regular expressions (via the `re` module) match and manipulate string patterns, like finding emails or splitting text.  
   For example, `re.findall(r"\d+", text)` extracts numbers.

39. **How do you strip extraneous information from data?**  
   Use string methods like `strip()`, `replace()`, or regular expressions to remove unwanted characters.  
   For example, `df["column"].str.strip()` removes leading/trailing spaces.

40. **What is data transformation in Python?**  
   Data transformation modifies data, like scaling, encoding categories, or creating new columns.  
   For example, `df["new_col"] = df["old_col"] * 2` doubles values.
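
A small pre-processing pass combining several of the steps above could look like this. It is only a sketch, assuming pandas is installed; the two DataFrames, their column names, and their values are invented for illustration.

```python
import pandas as pd

sales = pd.DataFrame({
    "city": [" Paris", "Lyon ", "Nice"],  # stray spaces on purpose
    "units": [10.0, None, 30.0],          # one missing value
})
regions = pd.DataFrame({
    "city": ["Paris", "Lyon", "Nice"],
    "region": ["north", "east", "south"],
})

# Strip leading/trailing whitespace so the keys match when merging.
sales["city"] = sales["city"].str.strip()

# Replace the missing value with the column mean: (10 + 30) / 2 = 20.
sales["units"] = sales["units"].fillna(sales["units"].mean())

# Min-Max scaling to the range [0, 1]: (x - min) / (max - min).
lowest, highest = sales["units"].min(), sales["units"].max()
sales["units_scaled"] = (sales["units"] - lowest) / (highest - lowest)

# Inner join on the shared "city" column.
merged = pd.merge(sales, regions, on="city", how="inner")
print(merged)
```

Scikit-learn's `MinMaxScaler` applies the same formula and is usually more convenient when the scaling must be reused on new data.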

## Module-4: Web Scraping and Numerical Analysis
41. **What is web scraping in Python?**  
   Web scraping extracts data from websites using libraries like `BeautifulSoup` or `Scrapy`.  
   It involves fetching HTML and parsing elements like tags or classes.

42. **How do you fetch a web page in Python?**  
   Use the `requests` library to send an HTTP GET request, e.g., `requests.get("url")`.  
   The response’s `.text` attribute provides the HTML content.

43. **What are CSS selectors in web scraping?**  
   CSS selectors identify HTML elements by tag, class, or ID, used with libraries like `BeautifulSoup`.  
   For example, `soup.select(".class")` finds elements with a specific class.

44. **How do you submit a form programmatically in Python?**  
   Use `requests.post("url", data={"field": "value"})` to send form data to a server.  
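   It mimics a browser submitting the form, but `requests` does not execute JavaScript, so pages that build their content dynamically may need a browser-automation tool instead.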

45. **What is NumPy used for in numerical analysis?**  
   NumPy provides efficient arrays and mathematical functions for numerical computations.  
   It supports operations like matrix multiplication, statistical analysis, and linear algebra.
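
The NumPy operations mentioned above (matrix multiplication, statistics, linear algebra) can be sketched as follows, assuming NumPy is installed. The matrix `a` and the vector `b` are arbitrary example values.

```python
import numpy as np

a = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

# Matrix multiplication with the @ operator.
product = a @ a

# A simple statistic: the mean of each column.
column_means = a.mean(axis=0)

# Linear algebra: solve the system a @ x = b, i.e.
#   2x + y  = 3
#   x  + 3y = 5
solution = np.linalg.solve(a, b)

print(product)       # rows: [5, 5] and [5, 10]
print(column_means)  # 1.5 and 2.0
print(solution)      # approximately 0.8 and 1.4
```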

## Module-5: Data Visualization with NumPy, Matplotlib, and Seaborn
46. **What is the role of Matplotlib in data visualization?**  
   Matplotlib creates customizable plots like line, scatter, bar, and histograms for data analysis.  
   It allows fine control over plot elements like titles and labels.

47. **How does Seaborn enhance Matplotlib?**  
   Seaborn provides high-level, aesthetically pleasing statistical plots built on Matplotlib.  
   It simplifies complex visualizations like heatmaps and violin plots.

48. **How do you plot a time series with Pandas?**  
   Use `df.plot()` with a DataFrame indexed by dates to visualize time series data.  
   For example, `df["value"].plot()` creates a line plot.

49. **What is a heatmap in Seaborn?**  
   A heatmap visualizes matrix data with color intensity, showing patterns like correlations.  
   For example, `sns.heatmap(df.corr(numeric_only=True))` displays the correlation matrix of the numeric columns.

50. **How do you add text to a Matplotlib plot?**  
   Use `plt.text(x, y, "text")` to place text at specific coordinates on the plot.  
   Alternatively, `plt.annotate()` adds text with arrows for emphasis.
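
Both ways of adding text to a plot are shown in the sketch below, assuming Matplotlib is installed. The data, label text, and coordinates are made up, and the `Agg` backend is chosen here only so the script runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; no window is opened
import matplotlib.pyplot as plt

x = [0, 1, 2, 3, 4]
y = [0, 1, 4, 9, 16]

plt.plot(x, y)

# Plain text placed at data coordinates (0.5, 12).
label = plt.text(0.5, 12, "y = x squared")

# Text at (2, 14) with an arrow pointing to the data point (4, 16).
note = plt.annotate(
    "largest value",
    xy=(4, 16),
    xytext=(2, 14),
    arrowprops={"arrowstyle": "->"},
)

plt.close()
```
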