Batch Selection for Multi-Label Classification: Guided by Dynamic Uncertainty and Label Correlations
python==3.7 skmultilearn==1.2.1 pytorch==1.11.0 numpy==1.21.5
In detail, the last cell can be configured to train on any of the following datasets:
path_to_arff_files = ["emotions", "scene", "yeast", "Corel5k", "rcv1subset1", "rcv1subset2", "rcv1subset3", "yahoo-Business1", "yahoo-Arts1", "bibtex", "tmc2007", "enron", "cal500", "LLOG-F"]
label_counts = [6, 6, 14, 374, 101, 101, 101, 28, 25, 159, 22, 53, 174, 75]
select_feature = [1, 1, 1, 1, 0.02, 0.02, 0.02, 0.05, 0.05, 1, 0.01, 1, 1, 1]
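The three lists above are aligned by position: each dataset name pairs with its label count and the fraction of features kept. A minimal sketch of zipping them into a per-dataset configuration (the dict layout and variable names here are illustrative assumptions, not the repository's code):

```python
# Pair each dataset with its label count and feature fraction
# (shortened lists for illustration; the full lists are given above).
path_to_arff_files = ["emotions", "scene", "yeast"]
label_counts = [6, 6, 14]
select_feature = [1, 1, 1]

configs = {
    name: {"label_count": q, "feature_fraction": frac}
    for name, q, frac in zip(path_to_arff_files, label_counts, select_feature)
}
print(configs["yeast"])  # {'label_count': 14, 'feature_fraction': 1}
```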
from collections import deque

def update_history(H, ids, y_pred, max_history_length):
    # Detach predictions from the graph and move them to the CPU for storage
    y_pred_numpy = y_pred.detach().cpu().numpy()
    for i, idx in enumerate(ids):
        if idx not in H:
            H[idx] = deque(maxlen=max_history_length)
        H[idx].append(y_pred_numpy[i])
    return H
max_history_length is the size of the sliding window; H maps each sample index to a queue that stores that sample's most recent predictions, up to the sliding-window size.
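The sliding-window behaviour can be seen with a toy example (plain numpy arrays stand in for torch tensors here, and the window size of 3 is an arbitrary illustrative choice):

```python
from collections import deque
import numpy as np

# Four epochs of predictions for a single sample (index 0), window size 3.
H = {}
max_history_length = 3
for epoch_pred in [0.1, 0.4, 0.6, 0.9]:
    if 0 not in H:
        H[0] = deque(maxlen=max_history_length)
    H[0].append(np.array([epoch_pred]))

print(len(H[0]))          # 3: the oldest prediction (0.1) was evicted
print(float(H[0][0][0]))  # 0.4: the window now starts at epoch 2
```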
def update_uncertainty(E, H, ids, label_dim, eps=1e-12):
    for idx in ids:
        current_predictions_history = np.array(H[idx])
        latest = current_predictions_history[-1]  # most recent prediction
        for j in range(label_dim):
            # mean absolute change between consecutive predictions of label j
            diffs = np.abs(np.diff(current_predictions_history[:, j]))
            mean_diffs = np.mean(diffs) if diffs.size else 0.0
            # binary entropy of the latest prediction, normalized to [0, 1];
            # clip avoids log(0) when a prediction saturates at 0 or 1
            p = np.clip(latest[j], eps, 1 - eps)
            current_entropy = -1 / np.log(2) * (
                p * np.log(p) + (1 - p) * np.log(1 - p)
            )
            E[idx, j] = 0.5 * mean_diffs + 0.5 * current_entropy
    return E
E is a two-dimensional array of shape n (number of samples) × q (number of labels), which stores the updated uncertainty of each sample-label pair in the current epoch.
def update_weights(E, epoch):
    E[E > 1] = 1
    if epoch >= 5:
        # discretize the uncertainties into five levels
        bins = np.array([0, 0.2, 0.4, 0.6, 0.8, 1.0])
        discrete_values = np.array([0.1, 0.3, 0.5, 0.7, 0.9])
        indices = np.digitize(E, bins) - 1
        indices = np.clip(indices, 0, len(discrete_values) - 1)
        E = discrete_values[indices]
    # label-correlation matrix (all-ones here)
    I = np.ones((E.shape[1], E.shape[1]))
    U = np.dot(E, I)
    # per-sample selection weight
    w = np.sum(U, axis=1)
    return w
This updates the selection weight w from the uncertainty matrix E.
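An end-to-end sketch of the weight update on a toy E with 2 samples and 3 labels (the values are illustrative). With an all-ones correlation matrix, every column of U equals the row sum of E, so w[i] reduces to q times the total uncertainty of sample i:

```python
import numpy as np

E = np.array([[0.1, 0.3, 0.9],
              [0.5, 0.5, 0.5]])
I = np.ones((E.shape[1], E.shape[1]))  # all-ones correlation matrix
U = np.dot(E, I)                       # each column = row sum of E
w = np.sum(U, axis=1)
print(w)  # [3.9 4.5], i.e. 3 * 1.3 and 3 * 1.5
```

Normalizing w (e.g. w / w.sum()) would turn these weights into sampling probabilities for batch selection; how the repository consumes w downstream is not shown in this snippet.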
analyse.ipynb includes all comparison methods; you can run the corresponding cell to obtain the results.
We are pleased that this paper has been accepted at AAAI 2025, and we will update this repository after its official publication.
If you have any questions or suggestions, feel free to contact me: zacqupt@gmail.com.