info.json
{
"abstract": "In this work, we provide a characterization of the feature-learning process in two-layer ReLU networks trained by gradient descent on the logistic loss following random initialization. We consider data with binary labels that are generated by an XOR-like function of the input features. We permit a constant fraction of the training labels to be corrupted by an adversary. We show that, although linear classifiers are no better than random guessing for the distribution we consider, two-layer ReLU networks trained by gradient descent achieve generalization error close to the label noise rate. We develop a novel proof technique that shows that at initialization, the vast majority of neurons function as random features that are only weakly correlated with useful features, and the gradient descent dynamics `amplify\u2019 these weak, random features to strong, useful features.",
"authors": [
"Spencer Frei",
"Niladri S. Chatterji",
"Peter L. Bartlett"
],
"emails": [
"frei@berkeley.edu",
"niladri@cs.stanford.edu",
"peter@berkeley.edu"
],
"id": "22-1132",
"issue": 303,
"pages": [
1,
49
],
"title": "Random Feature Amplification: Feature Learning and Generalization in Neural Networks",
"volume": 24,
"year": 2023
}
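
The abstract describes a concrete setting: XOR-like binary labels, a constant fraction of corrupted training labels, and a two-layer ReLU network trained by gradient descent on the logistic loss from random initialization. The following is a minimal NumPy sketch of that setting, for illustration only. The dimensions, noise rate, learning rate, the choice of random label flips as the corruption model, and the frozen second layer are all assumptions here, not the paper's exact construction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative XOR-like distribution: the label is the sign of the product
# of the first two coordinates; remaining coordinates are noise.
# d, n, and noise_rate are illustrative choices, not values from the paper.
d, n, noise_rate = 20, 1000, 0.1
X = rng.standard_normal((n, d))
y = np.sign(X[:, 0] * X[:, 1])        # XOR-like sign structure
flip = rng.random(n) < noise_rate     # corruption modeled as random flips here
y[flip] *= -1

# Two-layer ReLU network f(x) = sum_j a_j * relu(w_j . x), trained by
# plain gradient descent on the logistic loss from random initialization.
m, lr, steps = 512, 0.1, 500
W = rng.standard_normal((m, d)) / np.sqrt(d)  # at init, most neurons act as
a = rng.choice([-1.0, 1.0], m) / m            # weak random features

def forward(X):
    H = np.maximum(X @ W.T, 0.0)  # (n, m) ReLU activations
    return H, H @ a

for _ in range(steps):
    H, out = forward(X)
    g = -y / (1.0 + np.exp(y * out)) / n     # dLoss/d(output) for logistic loss
    dW = ((g[:, None] * (H > 0)) * a).T @ X  # chain rule through the ReLU
    W -= lr * dW                             # second layer kept fixed here

_, out = forward(X)
print("train error:", np.mean(np.sign(out) != y))
```

Under this sketch, a linear classifier cannot separate the XOR-like labels, while the trained network's error should approach the label noise rate, mirroring the amplification story in the abstract: training strengthens the initially weak correlations of random neurons with the useful features.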