
Machine Learning Coding Test Prep #1

@Yeazzing


Module 1: Machine Learning Fundamentals Practice Questions

Q1. Multiple Choice: Overfitting
Which of the following scenarios is most likely to lead to overfitting?
A) Increasing the amount of training data while keeping model complexity constant.
B) Using a model with high capacity relative to a small number of training samples.
C) Applying Early Stopping during the training process of a neural network.
D) Implementing data augmentation to introduce more variety in the training set.
-> B
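A quick way to see choice B in code (all numbers below are made up): an exact-interpolation polynomial is a maximal-capacity model for five points, so it memorizes the noise and generalizes badly.

```python
# A high-capacity model fit to very few noisy samples: zero training error,
# large error on an unseen point. Data and noise are hand-picked toys.

def lagrange_fit(xs, ys):
    """Return the unique degree-(n-1) polynomial through the n points."""
    def f(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return f

# Five samples of the true function y = x, with hand-picked noise
train_x = [0.0, 1.0, 2.0, 3.0, 4.0]
train_y = [x + e for x, e in zip(train_x, [0.0, 0.5, -0.4, 0.3, -0.2])]

model = lagrange_fit(train_x, train_y)
train_err = max(abs(model(x) - y) for x, y in zip(train_x, train_y))
test_err = abs(model(4.5) - 4.5)   # unseen point just past the training range

print(f"train error: {train_err:.2e}")  # essentially zero: pure memorization
print(f"test error:  {test_err:.2f}")   # far larger than the injected noise
```

More data (A) or early stopping (C) would blunt exactly this memorization effect.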

Q2. Fill-in-the-blank: KNN Parameter
In the k-Nearest Neighbors (KNN) algorithm, choosing a ( ① ) value for $k$ can make the model sensitive to noise and lead to overfitting, while a ( ② ) value for $k$ might result in a decision boundary that is too smooth, potentially leading to underfitting.
-> Small, Large
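A minimal sketch of the k sensitivity, on a made-up 1-D dataset with one deliberately mislabeled point: with k=1 the noisy label flips nearby predictions (overfitting), while a larger k lets the majority vote smooth it out.

```python
def knn_predict(train, x, k):
    """Majority-vote k-NN on 1-D points. train = [(value, label), ...]."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    labels = [lab for _, lab in nearest]
    return max(set(labels), key=labels.count)

# Class 0 clusters near 0..4, class 1 near 10..14 -- except one noisy
# label at x = 2.0 that really belongs to class 0.
train = [(0.0, 0), (1.0, 0), (2.0, 1), (3.0, 0), (4.0, 0),
         (10.0, 1), (11.0, 1), (12.0, 1), (13.0, 1), (14.0, 1)]

print(knn_predict(train, 2.1, k=1))  # 1: follows the single noisy neighbor
print(knn_predict(train, 2.1, k=5))  # 0: the majority vote recovers the true class
```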

Q3. Multiple Choice: Regularization
Which of the following statements about L1 and L2 Regularization is FALSE?
A) L1 Regularization adds the sum of the absolute values of the weights to the loss function.
B) L2 Regularization adds the sum of the squared values of the weights to the loss function.
C) L1 Regularization is known for its ability to perform feature selection by shrinking some weights to exactly zero.
D) L2 Regularization is more effective than L1 at producing sparse models where many features are ignored.
-> D
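Why D is false can be seen in one dimension: for a quadratic data-fit term (w - a)²/2, the L1- and L2-penalized minimizers have simple closed forms (soft-thresholding vs. pure shrinkage). The values of `a` and `lam` below are arbitrary.

```python
def l1_minimizer(a, lam):
    """argmin_w (w - a)^2 / 2 + lam * |w|  ->  soft-thresholding."""
    if a > lam:
        return a - lam
    if a < -lam:
        return a + lam
    return 0.0  # any |a| <= lam is clipped to exactly zero: sparsity

def l2_minimizer(a, lam):
    """argmin_w (w - a)^2 / 2 + lam * w^2 / 2  ->  pure shrinkage."""
    return a / (1.0 + lam)  # nonzero whenever a is nonzero

for a in [0.3, 1.5]:
    print(f"a={a}: L1 -> {l1_minimizer(a, 1.0)}, L2 -> {l2_minimizer(a, 1.0):.3f}")
```

Small weights land exactly at zero under L1 (feature selection), while L2 only scales them down.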

Q4. Fill-in-the-blank: Ensemble Methods
Among popular ensemble methods, ( ① ) is a representative of Bagging that trains multiple trees independently, whereas ( ② ) is a Boosting technique that builds trees sequentially to correct the errors of previous trees.
-> Random Forest, Gradient Boosting Machine
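The sequential half of that contrast can be sketched in a few lines (a toy, with no learning rate or subsampling): where bagging would train each tree independently and average, boosting fits each new weak learner to the residuals left by the model so far. Here the weak learner is a 1-D regression stump.

```python
def stump_fit(xs, ys):
    """Best single-split regression stump: a threshold plus two leaf means."""
    best = None
    for t in xs:
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((y - (lm if x <= t else rm)) ** 2 for x, y in zip(xs, ys))
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm

xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [float(x) for x in xs]          # target y = x: far too hard for one stump

pred = [0.0] * len(xs)
for _ in range(20):                  # each round corrects the previous errors
    resid = [y - p for y, p in zip(ys, pred)]
    stump = stump_fit(xs, resid)
    pred = [p + stump(x) for p, x in zip(pred, xs)]

mse = sum((y - p) ** 2 for y, p in zip(ys, pred)) / len(ys)
print(f"boosting train MSE after 20 stumps: {mse:.4f}")
```

A single stump leaves a large error on this data; the sequential residual fitting is what drives it down, which is exactly the bias-reduction role of Boosting.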

Q5. Multiple Choice: Bayes Rule
What is a major limitation or assumption when applying Bayes Rule in practical machine learning models (e.g., Naive Bayes)?
A) It assumes all features are independent of each other, which may not hold true in complex real-world data.
B) It requires a perfectly balanced dataset with an equal number of samples for each class to function.
C) The computational complexity increases exponentially as the number of data dimensions grows.
D) It is theoretically impossible to create non-linear decision boundaries using this approach.
-> A
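Choice A in miniature (all probabilities invented for illustration): Naive Bayes scores a class as P(c) · Πᵢ P(xᵢ | c). If two features are perfectly correlated, here literal duplicates of one word indicator, the same evidence is multiplied in twice and the posterior is skewed.

```python
priors = {"spam": 0.5, "ham": 0.5}
likelihood = {"spam": 0.9, "ham": 0.3}   # P(word present | class), made up

def posterior(n_copies):
    """Normalized class scores with the same feature repeated n_copies times."""
    scores = {c: priors[c] * likelihood[c] ** n_copies for c in priors}
    z = sum(scores.values())
    return {c: s / z for c, s in scores.items()}

print(posterior(1))  # honest posterior: P(spam) ~ 0.75
print(posterior(2))  # duplicated feature double-counts evidence: P(spam) ~ 0.90
```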

Q6. Short Answer: Neural Networks
What is the term for the process in a neural network where input data is multiplied by weights, summed with a bias, and passed through an activation function to produce an output for the next layer?
-> Forward Propagation
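The definition in the question maps directly onto code. A minimal single-layer forward pass (weights, bias, and inputs below are arbitrary): multiply inputs by weights, add the bias, apply an activation.

```python
import math

def forward(x, W, b):
    """One dense layer: z = W @ x + b, then a = sigmoid(z) elementwise."""
    z = [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(W, b)]
    return [1.0 / (1.0 + math.exp(-zi)) for zi in z]

x = [1.0, 2.0]
W = [[0.5, -0.25],    # 2 inputs -> 2 hidden units
     [0.1, 0.4]]
b = [0.0, 0.1]

print(forward(x, W, b))   # activations fed to the next layer
```

Stacking such layers, each consuming the previous layer's output, is forward propagation through the whole network.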


Q1. Multiple Choice: Bias-Variance Tradeoff
A model that is too simple and fails to capture the underlying patterns of the data is said to have High Bias. Conversely, a model that is overly complex and sensitive to small fluctuations in the training set is said to have High Variance. Which of the following is a typical result of a High Variance model?
A) The model performs poorly on both training and test data.
B) The model has a very low error rate on training data but a high error rate on test data.
C) The model's decision boundary is a straight line, regardless of the data distribution.
D) The model is consistently underfitting the data.
-> B
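High variance can be made literal with a toy experiment (noise values hand-picked): refit a maximally flexible model (1-nearest-neighbor regression) on two noise resamples of the same inputs and its prediction jumps, while a rigid high-bias model (predict the global mean) barely moves.

```python
def knn1(train, x):
    """1-NN regression: return the y of the closest training x."""
    return min(train, key=lambda p: abs(p[0] - x))[1]

sample_a = [(0.0, 0.2), (1.0, 0.9), (2.0, 2.3)]   # y = x plus one noise draw
sample_b = [(0.0, -0.3), (1.0, 1.4), (2.0, 2.3)]  # same x's, a second draw

mean_a = sum(y for _, y in sample_a) / len(sample_a)
mean_b = sum(y for _, y in sample_b) / len(sample_b)

d_knn = abs(knn1(sample_a, 1.0) - knn1(sample_b, 1.0))
d_mean = abs(mean_a - mean_b)

print(f"flexible model shift: {d_knn:.2f}")   # large: it tracks the noise
print(f"rigid model shift:    {d_mean:.2f}")  # tiny: stable across resamples
```

The flexible model's instability across training sets is what shows up as low training error but high test error (choice B).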

Q2. Fill-in-the-blank: KNN Distance Metric
While Euclidean Distance is the most common metric for KNN, it can be problematic when features have different scales. To ensure that all features contribute equally to the distance calculation, it is crucial to perform ( ① ) before applying the KNN algorithm.
-> Normalization
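A small demonstration with invented features: feature 1 is a height in cm (range roughly 150–200) and feature 2 a score in [0, 1]. On raw values the Euclidean distance is dominated by the cm feature; after min-max scaling both features contribute comparably and the nearest neighbor changes.

```python
import math

def euclid(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

p  = (170.0, 0.9)
q1 = (171.0, 0.1)   # close in height, very different score
q2 = (178.0, 0.9)   # farther in height, identical score

print(euclid(p, q1) < euclid(p, q2))  # True: raw distance ignores the score

def scale(v, lo, hi):
    """Min-max scale one value to [0, 1] given an assumed feature range."""
    return (v - lo) / (hi - lo)

def scaled(pt):
    return (scale(pt[0], 150.0, 200.0), scale(pt[1], 0.0, 1.0))

print(euclid(scaled(p), scaled(q1)) < euclid(scaled(p), scaled(q2)))
# False: after scaling, q2 becomes the nearer neighbor
```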

Q3. Multiple Choice: Random Forest vs. GBM
Which of the following describes a key advantage of Random Forest over a single Decision Tree?
A) It reduces variance by averaging the predictions of multiple deep trees grown independently.
B) It reduces bias by sequentially focusing on the samples that were misclassified by previous trees.
C) It always produces a simpler and more interpretable model than a single tree.
D) It requires much less memory and computational power than a single tree.
-> A
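The variance-reduction claim in choice A can be checked exactly with a toy error model: if each tree's prediction error is an independent ±1 coin flip, the average of B trees has variance 1/B. Enumerating all outcomes for B = 4 makes this deterministic, no sampling needed.

```python
from itertools import product

def variance(vals):
    m = sum(vals) / len(vals)
    return sum((v - m) ** 2 for v in vals) / len(vals)

# Each "tree" errs by +1 or -1 with equal probability (a toy error model).
single = [-1.0, 1.0]
B = 4
averaged = [sum(errs) / B for errs in product((-1.0, 1.0), repeat=B)]

print(variance(single))    # variance of one tree's error
print(variance(averaged))  # 1/B of that: averaging independent trees cuts variance
```

In a real Random Forest the trees are only approximately independent (bootstrap samples plus random feature subsets), so the reduction is smaller than 1/B but still substantial.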

Q4. Fill-in-the-blank: Regularization Effect
In ( ① ) Regularization (Ridge), the penalty term is proportional to the square of the magnitude of coefficients. This tends to shrink all coefficients towards zero but rarely makes them exactly zero, resulting in a model where ( ② ) features are retained but with smaller weights.
-> L2, All
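The shrink-but-never-zero behavior is easiest to see in one dimension: minimizing (w − a)²/2 + λw²/2 and setting the derivative to zero gives w = a / (1 + λ). The value a = 2.0 below is arbitrary.

```python
# Sweep the Ridge penalty strength: the weight decays toward zero
# monotonically but every value stays strictly positive.
a = 2.0
weights = [a / (1.0 + lam) for lam in (0.0, 1.0, 10.0, 100.0)]
print(weights)
```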

Q5. Multiple Choice: Neural Network Activation
Which activation function is most commonly used in the hidden layers of modern deep learning models to mitigate the vanishing gradient problem?
A) Sigmoid
B) Tanh
C) ReLU (Rectified Linear Unit)
D) Step Function
-> C
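The reason is visible in the gradients. A sigmoid's derivative is at most 0.25 and collapses toward zero for large |z| (which is what vanishes when many layers multiply such factors), while ReLU's derivative is exactly 1 for any positive input.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)   # peaks at 0.25, tiny once |z| is large (saturation)

def relu_grad(z):
    return 1.0 if z > 0 else 0.0   # constant 1 on the active side

for z in [1.0, 5.0, 10.0]:
    print(f"z={z}: sigmoid' = {sigmoid_grad(z):.6f}, relu' = {relu_grad(z)}")
```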

Q6. Short Answer: Model Evaluation
In a binary classification task where the classes are highly imbalanced (e.g., fraud detection), which metric is generally more informative than simple Accuracy?
-> F1-Score
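A concrete illustration with invented counts: 990 legitimate transactions and 10 frauds. A classifier that always predicts the majority class scores 99% accuracy while catching zero fraud; the F1-score on the positive class exposes it.

```python
y_true = [0] * 990 + [1] * 10
y_pred = [0] * 1000                 # always predicts the majority class

tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

precision = tp / (tp + fp) if tp + fp else 0.0
recall = tp / (tp + fn) if tp + fn else 0.0
f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0

print(f"accuracy = {acc:.3f}, F1 = {f1:.3f}")  # high accuracy, zero F1
```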
