Emitting log file causes Python kernel crashing #53

mailology · 2021-06-04T03:45:26Z

When testing the Python code on MNIST data with the PAM algorithm (algorithm = "naive"), setting verbosity = 1 crashes the Python kernel. The following code reproduces the crash:

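import pandas as pd
from sklearn.manifold import TSNE
from BanditPAM import KMedoids  # import path may differ by package version

X = pd.read_csv('data/MNIST-1k.csv', sep=' ', header=None).to_numpy()
X_tsne = TSNE(n_components = 2).fit_transform(X)

kmed = KMedoids(n_medoids = 10, algorithm = "naive", verbosity = 1)
kmed.fit(X, 'L2', 10, "naive_v1_mnist_log")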

The code above runs correctly if verbosity = 1 is removed. If we change the algorithm to "BanditPAM", verbosity = 1 causes no problem and the log file is generated properly.

motiwari · 2021-07-03T15:56:07Z

Nice find, @mailology! Think you can take a look?

mailology · 2021-07-08T16:32:54Z

The crash occurs because we forgot to update the swap loss, logHelper.loss_swap, in the swap part of the naive algorithm. We also forgot to update the number of swaps, stored in the variable steps. I have fixed both and rerun the example above.

It works now and emits the following log file:

Built:891,392,354,714,23,805,527,777,251,972
Swapped:694,168,306,714,324,959,527,800,251,737
Num Swaps: 10
Final Loss: 7.44375
Build Logstring:
		:compute_exactly
		:loss
		:p
		:sigma
Swap Logstring:
		:compute_exactly
		:loss
				0: 7.52346
				1: 7.50876
				2: 7.49285
				3: 7.48046
				4: 7.47164
				5: 7.46393
				6: 7.45825
				7: 7.44646
				8: 7.44375
				9: 7.44375
		:p
		:sigma

Since the build step is greedy and we go through all possible iterations, do we still want to give any loss information there?
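
For readers unfamiliar with the swap phase, here is a minimal Python sketch of a naive PAM swap loop that records a loss on every iteration and counts its steps. It is not the library's code: `total_loss`, `naive_swap`, `swap_losses`, and `steps` are hypothetical names standing in for the `logHelper.loss_swap` and `steps` fields mentioned above. Recording the loss even on the final, non-improving iteration is an assumption made to match the repeated last value in the log.

```python
import numpy as np


def total_loss(dist, medoids):
    """Mean distance from each point to its nearest medoid."""
    return dist[:, medoids].min(axis=1).mean()


def naive_swap(dist, medoids, max_iter=10):
    """Greedy swap phase of naive PAM (illustrative sketch only).

    Returns the final medoids, the loss recorded at each iteration,
    and the number of iterations performed.
    """
    medoids = list(medoids)
    n = dist.shape[0]
    best = total_loss(dist, medoids)
    swap_losses = []  # hypothetical stand-in for logHelper.loss_swap
    steps = 0         # hypothetical stand-in for the steps counter
    for _ in range(max_iter):
        best_swap = None
        # Try every (medoid slot, non-medoid point) pair and keep the best.
        for i in range(len(medoids)):
            for c in range(n):
                if c in medoids:
                    continue
                trial = medoids.copy()
                trial[i] = c
                loss = total_loss(dist, trial)
                if loss < best:
                    best, best_swap = loss, (i, c)
        if best_swap is not None:
            i, c = best_swap
            medoids[i] = c
        # Update the log on every iteration, so the two stay in sync.
        swap_losses.append(best)
        steps += 1
        if best_swap is None:
            break  # no swap improves the loss: converged
    return medoids, swap_losses, steps


# Tiny 1-D example: two well-separated groups of three points.
points = np.array([0.0, 1.0, 2.0, 10.0, 11.0, 12.0])
dist = np.abs(points[:, None] - points[None, :])
medoids, swap_losses, steps = naive_swap(dist, [0, 1])
print(medoids, swap_losses, steps)
```

In this sketch the length of `swap_losses` always equals `steps`, which is the invariant the fix restores for the naive algorithm's log.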

motiwari · 2021-07-08T16:38:44Z

Love it!

And yes, we should fill all of those fields out, including for the build step, and the compute_exactly, p, sigma, etc. as in the prior logfiles: https://github.com/motiwari/BanditPAM-python/blob/master/profiles/MNIST_L2_k10_paper/L-ucb-True-BS-v-0-k-10-N-1000-s-42-d-MNIST-m-L2-w-
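
To compare a naive-algorithm log against earlier log files, the per-iteration swap losses can be read back from the text. The sketch below assumes the layout shown in this thread (a `Swap Logstring:` header, `:field` lines, and indented `index: value` entries); `parse_swap_losses` is a hypothetical helper, not part of BanditPAM.

```python
def parse_swap_losses(log_text):
    """Return the losses listed under ':loss' in the 'Swap Logstring:' section."""
    losses = []
    in_swap = False  # inside the "Swap Logstring:" section
    in_loss = False  # inside that section's ":loss" field
    for raw in log_text.splitlines():
        line = raw.strip()
        if line.endswith("Logstring:"):
            in_swap = line.startswith("Swap")
            in_loss = False
        elif line.startswith(":"):
            in_loss = in_swap and line == ":loss"
        elif in_loss and ":" in line:
            _, value = line.split(":", 1)
            losses.append(float(value))
    return losses


# Excerpt of the log posted above, truncated to three swap iterations.
sample = (
    "Build Logstring:\n"
    "\t\t:compute_exactly\n"
    "\t\t:loss\n"
    "Swap Logstring:\n"
    "\t\t:compute_exactly\n"
    "\t\t:loss\n"
    "\t\t\t\t0: 7.52346\n"
    "\t\t\t\t1: 7.50876\n"
    "\t\t\t\t2: 7.49285\n"
    "\t\t:p\n"
    "\t\t:sigma\n"
)
losses = parse_swap_losses(sample)
print(losses)
```

A quick sanity check on the result is that the losses never increase from one swap iteration to the next.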

complete the logfile for naive algorithm. Fixes #53

motiwari assigned mailology Jul 3, 2021

motiwari added this to the 7-16-21 Milestone milestone Jul 9, 2021

mailology mentioned this issue Jul 11, 2021

complete the logfile for naive algorithm. Fixes #53 #71

Merged

mailology pushed a commit that referenced this issue Jul 14, 2021

introduce sigma_log function for sigma output. Fixes #53

edf7771

motiwari closed this as completed in 4f8a2d9 Jul 16, 2021

motiwari added a commit that referenced this issue Jul 16, 2021

Merge pull request #71 from ThrunGroup/naive_logfile

888693e
