
Hint mechanism different in the paper? #2

Closed · rcamino opened this issue Aug 6, 2018 · 11 comments

@rcamino commented Aug 6, 2018

In the paper, the hint mechanism first selects one feature per row; the selected indices form a vector called k.

Then a matrix b with the same size as the mask m is created, with one zero per row (at the position given by k) and the rest set to one.

Then the hint is created with the equation:

h = b * m + 0.5 * (1.0 - b)

This means the hint is almost a copy of m, but with exactly one 0.5 per row.
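
For concreteness, here is a minimal numpy sketch of that construction (an editorial illustration; the sizes and variable names are not from the repository):

import numpy as np

n, d = 4, 5                              # illustrative sizes
m = (np.random.rand(n, d) > 0.5) * 1.0   # mask: 1 = observed, 0 = missing

k = np.random.randint(0, d, size=n)      # one selected feature per row
b = np.ones((n, d))
b[np.arange(n), k] = 0.0                 # exactly one zero per row

h = b * m + 0.5 * (1.0 - b)              # equals m everywhere except one 0.5 per row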

In your implementation, the hint is created by removing ones of the mask m with a probability of 0.1 (or a probability of keeping them of 0.9). There are no 0.5 values in the hint.

import numpy as np

# 1. Mini batch size
mb_size = 128
# 2. Missing rate
p_miss = 0.5
# 3. Hint rate
p_hint = 0.9
# 4. Loss hyperparameters
alpha = 10
# 5. Input dim (fixed)
Dim = 784

def sample_M(m, n, p):
    # Binary (m x n) matrix whose entries are 1 with probability 1 - p
    A = np.random.uniform(0., 1., size=[m, n])
    B = A > p
    C = 1. * B
    return C

M_mb = sample_M(mb_size, Dim, p_miss)        # mask: 1 = observed, 0 = missing
H_mb1 = sample_M(mb_size, Dim, 1 - p_hint)   # 1 with probability p_hint
H_mb = M_mb * H_mb1                          # hint: each observed entry kept w.p. 0.9
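
As a quick sanity check (an editorial addition, assuming the snippet above has just run), the fraction of mask ones surviving into the hint should be close to the hint rate:

print(H_mb.sum() / M_mb.sum())  # fraction of observed entries revealed in the hint; ~0.9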

Am I understanding something wrong?
Thank you.

@jsyoon0823 (Owner)

In practice, providing 90% of the mask vector as the hint gives the best performance. (The hint is only given for the known features.)
In theory (in the paper), a hint that leaves exactly one feature per row undetermined converges to the optimal solution under the MCAR setting.

@rcamino (Author) commented Aug 6, 2018

Thank you for the quick answer.

Besides the convergence question, I still have a doubt about the 0.5.

In the paper I understood that the hint was indicating:

  • 1 -> known original value
  • 0 -> known imputed value
  • 0.5 -> unknown

And the discriminator has to decide whether the 0.5 is an original or an imputed value.

But in this implementation, the hint shows:

  • 1 -> known original value
  • 0 -> unknown

So the hint only helps with the known original values, but gives no information about the missing ones?
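
To make the contrast concrete, here is a small sketch comparing the two encodings on a single mask row (all values illustrative):

import numpy as np

m = np.array([1., 0., 1., 1., 0.])       # mask: 1 = observed, 0 = missing

# Paper: reveal m everywhere except one selected feature (index 1 here)
b = np.array([1., 0., 1., 1., 1.])
h_paper = b * m + 0.5 * (1. - b)         # -> [1. , 0.5, 1. , 1. , 0. ]

# This code: drop each revealed 1 with probability 1 - p_hint; zeros stay 0
keep = np.array([1., 1., 0., 1., 1.])    # one possible draw
h_code = m * keep                        # -> [1., 0., 0., 1., 0.]

With h_paper the discriminator only has to decide about the single 0.5; with h_code every 0 is ambiguous between "missing" and "observed but not hinted".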

@jsyoon0823 (Owner)

Yes.
In this code, the hint is only provided for the known variables.
Therefore, the discriminator has to determine whether a 0 is an original or an imputed value.
We don't provide the imputed variables in the hint; therefore, we don't need to introduce 0.5 here.
Thanks.

@ElApseR commented Jan 1, 2019

I have tested the two types of hint, the original paper's vs. this code's, and found that the performance of the two models was almost the same. Although the variance of the MSE test loss with the original paper's design (using 0.5 in the hint) was a bit higher, it didn't seem meaningful.
[Screenshot: MSE test loss comparison between the two hint mechanisms]

@jsyoon0823 (Owner)

Usually, under the missing-completely-at-random setting, the hint does not have a big impact on the results.

@guoliangxie123

1. Is the imputed matrix equal to Hat_New_X?
2. When I print Hat_New_X, I find that some 0 positions are not imputed. Are those 0s in the original data?
Looking forward to your reply.

@jsyoon0823 (Owner)

  1. Yes. G_sample is the output of the generator, and Hat_New_X is the matrix in which only the missing values are replaced by G_sample.
  2. Yes, some of them have 0 as their original values.
Thanks!
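
In formula form (a sketch; the names follow this thread, so the exact line in gain.py may differ):

Hat_New_X = M * New_X + (1. - M) * G_sample  # keep observed values, fill missing positions from the generator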

@guoliangxie123

Thank you for the quick answer.

  1. Does the letter data in your code contain missing values that have been filled with 0?
  2. I can't compare the imputed data with the original dataset because there is no raw dataset. I recently wrote a paper citing yours and used GAIN to impute my data, but the results were not ideal.

@jsyoon0823 (Owner)

  1. No. The letter data is complete. I introduce the missingness in lines 51-59 and line 210; please check those lines.
  2. The original raw data is always there for you to compare; please see lines 233 and 186.
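
For the comparison in point 2, a typical check is the RMSE restricted to the artificially removed entries (a sketch; X, M, and Hat_New_X are assumed to be the complete data, the mask, and the imputed matrix):

import numpy as np

def imputation_rmse(X, M, Hat_New_X):
    # RMSE over the positions that were made missing (M == 0)
    return np.sqrt(np.sum(((1. - M) * (X - Hat_New_X)) ** 2) / np.sum(1. - M))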

@ainilaha

In the paper, Figure 1 shows that you feed three matrices to the generator: the data matrix, the random matrix, and the mask matrix. But I do not see the random matrix being fed to the generator in the code. What is the random matrix?

@jsyoon0823 (Owner)

You can see how we use the random matrix at this link: https://github.com/jsyoon0823/GAIN/blob/master/gain.py#L168-L169
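
For readers following along, those lines combine the three matrices roughly as below (a sketch; the noise range and the X_mb/Z_mb names are assumptions, not copied from gain.py, and mb_size, Dim, and M_mb mirror the earlier snippet):

import numpy as np

Z_mb = np.random.uniform(0., 0.01, size=[mb_size, Dim])  # random matrix z (assumed noise range)
New_X_mb = M_mb * X_mb + (1. - M_mb) * Z_mb              # noise fills the missing slots
# New_X_mb, together with M_mb, is what the generator actually receives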
