KL Loss is right? #6

Closed
BridgetteSong opened this issue Jun 15, 2021 · 17 comments
Labels
good first issue Good for newcomers

Comments

@BridgetteSong

BridgetteSong commented Jun 15, 2021

When I searched for the KL divergence between two Gaussians, I found this, which is different from your KL loss:
https://stats.stackexchange.com/questions/7440/kl-divergence-between-two-univariate-gaussians

@jaywalnut310
Owner

jaywalnut310 commented Jun 15, 2021

Hi @BridgetteSong. Yes, the closed-form KL divergence between two Gaussians is different from our KL loss. That's because we compute the KL divergence between a Gaussian and a distribution defined through a normalizing flow, rather than between two Gaussians, so there is no closed form as in the Gaussian case. Equation 4 of our paper shows that the prior distribution is not Gaussian.

If you're not familiar with normalizing flows, or don't know how to calculate their log-likelihood (which is needed for calculating the KL), it would be better to look at these blog posts first: nf1 and nf2. They are great illustrative posts about normalizing flows and include model implementations.
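
For intuition, here is a minimal, self-contained sketch (not the repo's code) of how a flow's log-likelihood is computed via the change-of-variables formula, using a toy element-wise affine map in place of the real coupling layers:

import torch
from torch.distributions import Normal

# Toy affine flow f(z) = a*z + b mapping z into the base (Gaussian) space.
# Change of variables: log p(z) = log p_base(f(z)) + log|det df/dz|.
a, b = torch.tensor(2.0), torch.tensor(0.5)
base = Normal(loc=0.0, scale=1.0)

def flow_log_prob(z):
    fz = a * z + b                     # forward transform
    log_det = torch.log(torch.abs(a))  # log|det df/dz| of the affine map
    return base.log_prob(fz) + log_det

z = torch.randn(5)
print(flow_log_prob(z))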

@BridgetteSong
Author

Thank you for your reply.
As I understand it, the posterior distribution is Gaussian, and the prior distribution is the product of a Gaussian and the absolute value of the Jacobian determinant (Equation 4). So the KL loss would be:

1. q(z/x) = torch.distributions.normal.Normal(m_q, exp(logs_q))
2. p(z/c) = torch.distributions.normal.Normal(m_p, exp(logs_p)) * torch.abs(jacobian determinant)
3. kl_loss = torch.distributions.kl.kl_divergence(q(z/x), p(z/c))

Is my understanding right? And is this kl_loss equal to your KL loss?
I would appreciate a detailed explanation.
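
For reference, the closed-form two-Gaussian KL from step 3 can be sketched with torch.distributions like this (toy tensors; it ignores the Jacobian factor from step 2, which torch.distributions cannot represent directly, and, as explained in the replies below, it is not the loss VITS actually uses):

import torch
from torch.distributions import Normal, kl_divergence

# Closed-form KL between two diagonal Gaussians (NOT the VITS loss).
m_q, logs_q = torch.zeros(4), torch.zeros(4)
m_p, logs_p = torch.ones(4), 0.5 * torch.ones(4)

q = Normal(m_q, torch.exp(logs_q))
p = Normal(m_p, torch.exp(logs_p))
print(kl_divergence(q, p))  # element-wise KL, shape (4,)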

@jaywalnut310
Owner

jaywalnut310 commented Jun 16, 2021

@BridgetteSong You're right: the posterior is Gaussian, and the prior is the product of a Gaussian and the Jacobian determinant.

Let me explain the kl loss in detail. For brevity, and without loss of generality, I'll assume the channel dimension of latent variables is one.

The kl divergence is the mean of the difference of log probabilities as follows:

  • mean(log(q(z/x))) - mean(log(p(z/c))), where z ~ q(z/x)

As q(z/x) is Gaussian, the mean of log(q(z/x)) has a closed form, namely the negative entropy of a Gaussian (see https://en.wikipedia.org/wiki/Normal_distribution):

  • mean of log(q(z/x)) = negative entropy of q(z|x) = -logs_q - 0.5 - 0.5 * log(2*pi)
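
A quick numerical check of this closed form (a sketch with made-up values, not repo code):

import math
import torch
from torch.distributions import Normal

# E_q[log q(z)] for q = N(m_q, exp(logs_q)^2) equals -logs_q - 0.5 - 0.5*log(2*pi).
m_q, logs_q = torch.tensor(0.3), torch.tensor(-1.2)
q = Normal(m_q, torch.exp(logs_q))

closed_form = -logs_q - 0.5 - 0.5 * math.log(2 * math.pi)
print(closed_form.item(), -q.entropy().item())         # identical
print(q.log_prob(q.sample((100000,))).mean().item())   # Monte Carlo estimate, close to the above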

On the other hand, the mean of log(p(z/c)) has no closed-form solution. So we have to calculate log(p(z/c)) for each sampled z and then average them out:

  • log(p(z/c)) = log(N(f(z)|m_p, logs_p)) + logdet(df/dz), where f is a normalizing flow.

As we constrain the normalizing flow of the prior distribution to be volume-preserving, using shift-only (= mean-only) operations in the coupling layers, the Jacobian determinant of the prior flow is one (see

vits/models.py

Line 449 in 2e561ba

self.flow = ResidualCouplingBlock(inter_channels, hidden_channels, 5, 1, 4, gin_channels=gin_channels)
and

vits/models.py

Lines 179 to 209 in 2e561ba

class ResidualCouplingBlock(nn.Module):
  def __init__(self,
      channels,
      hidden_channels,
      kernel_size,
      dilation_rate,
      n_layers,
      n_flows=4,
      gin_channels=0):
    super().__init__()
    self.channels = channels
    self.hidden_channels = hidden_channels
    self.kernel_size = kernel_size
    self.dilation_rate = dilation_rate
    self.n_layers = n_layers
    self.n_flows = n_flows
    self.gin_channels = gin_channels

    self.flows = nn.ModuleList()
    for i in range(n_flows):
      self.flows.append(modules.ResidualCouplingLayer(channels, hidden_channels, kernel_size, dilation_rate, n_layers, gin_channels=gin_channels, mean_only=True))
      self.flows.append(modules.Flip())

  def forward(self, x, x_mask, g=None, reverse=False):
    if not reverse:
      for flow in self.flows:
        x, _ = flow(x, x_mask, g=g, reverse=reverse)
    else:
      for flow in reversed(self.flows):
        x = flow(x, x_mask, g=g, reverse=reverse)
    return x
):

  • log(p(z/c)) = log(N(f(z)|m_p, logs_p)) + 0 = -logs_p - 0.5 * log(2*pi) - 0.5 * exp(-2 * logs_p) * (f(z) - m_p) ** 2

Then, kl = (negative entropy of q(z/x)) - (average of log(p(z/c))), which for each sample works out to:

  • (logs_p - logs_q - 0.5) + 0.5 * exp(-2 * logs_p) * (f(z) - m_p) ** 2, where f(z) is z_p in our code.

This is the explanation of the kl loss (

vits/losses.py

Lines 57 to 60 in 2e561ba

kl = logs_p - logs_q - 0.5
kl += 0.5 * ((z_p - m_p)**2) * torch.exp(-2. * logs_p)
kl = torch.sum(kl * z_mask)
l = kl / torch.sum(z_mask)
).
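
Putting the pieces together, here is a self-contained sketch: the four lines above wrapped into a standalone function (using the kl_loss signature quoted later in this thread), fed with toy tensors. Here z_p is just a reparameterized posterior sample standing in for the flow output f(z), which in the model comes from the flow.

import torch

def kl_loss(z_p, logs_q, m_p, logs_p, z_mask):
    # (logs_p - logs_q - 0.5) + 0.5 * exp(-2*logs_p) * (z_p - m_p)**2,
    # summed over unmasked positions and normalized by the mask sum.
    kl = logs_p - logs_q - 0.5
    kl += 0.5 * ((z_p - m_p) ** 2) * torch.exp(-2.0 * logs_p)
    kl = torch.sum(kl * z_mask)
    return kl / torch.sum(z_mask)

# Toy shapes (batch, channels, frames)
B, C, T = 2, 8, 50
m_q, logs_q = torch.randn(B, C, T), 0.1 * torch.randn(B, C, T)
m_p, logs_p = torch.randn(B, C, T), 0.1 * torch.randn(B, C, T)
z = m_q + torch.randn_like(m_q) * torch.exp(logs_q)  # reparameterized sample from q(z/x)
z_mask = torch.ones(B, 1, T)
print(kl_loss(z, logs_q, m_p, logs_p, z_mask))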

@jaywalnut310 jaywalnut310 added the good first issue Good for newcomers label Jun 16, 2021
@BridgetteSong
Author

Thank you very much for your patience and detailed answer, I got it.

@BridgetteSong
Author

BTW, since the prior is the product of a Gaussian and the Jacobian determinant, and given the properties of the Gaussian distribution (if X ~ N(u, σ**2), then aX + b ~ N(au + b, (aσ)**2)), the prior is always Gaussian when the Jacobian determinant is a constant. So can we calculate the KL divergence with the two-Gaussian closed form mentioned above, or use the torch API to get it directly, like this?
kl_loss = torch.distributions.kl.kl_divergence(q(z/x), p(z/c))

@jaywalnut310
Owner

Good point! When the channel dimension of the latent variables is one, the prior is indeed Gaussian if the Jacobian determinant is constant.
However, when the channel dimension exceeds one, that is no longer true.
For example, let (x1, x2) ~ N((0, 0), I) and transform it into (y1, y2) = (x1, cos(x1) + x2).
Because of the non-linear transformation, the joint distribution of (y1, y2) is not Gaussian.
However, the Jacobian determinant is still one, as the first-order derivatives are dy1/dx1 = 1, dy1/dx2 = 0, dy2/dx1 = -sin(x1), dy2/dx2 = 1.

The normalizing flow of the prior likewise provides a non-linear transformation through neural networks while maintaining a constant Jacobian determinant, resulting in a non-Gaussian prior distribution. If the flow only allowed linear transformations, or if the channel dimension of the latent variables were one, you could use the KL divergence between two Gaussians; in general, you cannot.
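
A small numerical sketch of the counter-example above (toy code, not from the repo): the correlation between y1 and y2 is roughly zero, yet y2 clearly depends on y1, so the pair cannot be jointly Gaussian.

import torch

torch.manual_seed(0)

# (x1, x2) ~ N(0, I), (y1, y2) = (x1, cos(x1) + x2); Jacobian determinant = 1 everywhere.
N = 1000000
x1, x2 = torch.randn(N), torch.randn(N)
y1, y2 = x1, torch.cos(x1) + x2

# Correlation is ~0, since E[x1 * cos(x1)] = 0 by symmetry ...
print("corr(y1, y2) ≈", torch.corrcoef(torch.stack([y1, y2]))[0, 1].item())

# ... but y2 still depends on y1 (E[y2 | y1] = cos(y1)). For a jointly Gaussian pair,
# zero correlation would imply independence, so these conditional means would match.
print("E[y2 | |y1| < 0.5] ≈", y2[y1.abs() < 0.5].mean().item())  # clearly positive
print("E[y2 | |y1| > 2.0] ≈", y2[y1.abs() > 2.0].mean().item())  # clearly negative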

@BridgetteSong
Author

Thank you very much again. I totally understand now. I learned much from your detailed answer.

@haoheliu

haoheliu commented Dec 7, 2021

Hi @jaywalnut310, thanks for your detailed answer, it was very helpful! I'd like to ask two more questions, and I'd appreciate your answers.

As we constrain the normalizing flow of the prior distribution to be volume-preserving, using shift-only (= mean-only) operations in the coupling layers, the Jacobian determinant of the prior flow is one.

You mentioned that you set up the normalizing flow to be volume-preserving. Does this choice benefit the model? In my understanding, it could be replaced by a more expressive non-volume-preserving flow.

The kl divergence is the mean of the difference of log probabilities as follows:
mean(log(q(z/x))) - mean(log(p(z/c))), where z ~ q(z/x)

As far as I know, the KL divergence lies in the range [0, +inf). But according to your formula, its value could be negative? (ref: https://towardsdatascience.com/light-on-math-machine-learning-intuitive-guide-to-understanding-kl-divergence-2b382ca2b2a8)

@candlewill

How about taking the absolute value to keep the KL loss from going negative?

--- a/losses.py
+++ b/losses.py
@@ -54,7 +54,7 @@ def kl_loss(z_p, logs_q, m_p, logs_p, z_mask):
   logs_p = logs_p.float()
   z_mask = z_mask.float()
 
-  kl = logs_p - logs_q - 0.5
+  kl = torch.abs(logs_p - logs_q - 0.5)

@yanggeng1995

(Quoted @jaywalnut310's full KL loss explanation above.)

Hi, I'm a bit confused by "The kl divergence is the mean of the difference of log probabilities as follows: mean(log(q(z/x))) - mean(log(p(z/c))), where z ~ q(z/x)". Doesn't the KL divergence require an integral? You directly set kl = mean(log(q(z/x))) - mean(log(p(z/c))); is this an approximation? And in that case, isn't it more convenient to compute the negative log-likelihood directly from the flow output z_p together with m_p and logs_p?

@BridgetteSong
Author

BridgetteSong commented Sep 9, 2022

(Quoted @yanggeng1995's question above.)

@yanggeng1995 Let me add a few supplementary points:

  1. KL_loss = ∫q(z/x) * (log(q(z/x)) - log(p(z/c))) dz = ∫q(z/x) * log(q(z/x)) dz - ∫q(z/x) * log(p(z/c)) dz
  2. As q(z/x) is Gaussian, ∫q(z/x) * log(q(z/x)) dz = -logs_q - 0.5 - 0.5 * log(2*pi).
  3. We can't compute ∫q(z/x) * log(p(z/c)) dz directly, so we approximate it by sampling: draw some z values and average log(p(z/c)) over them. In VAE code, sampling a single z per step is usually enough, so ∫q(z/x) * log(p(z/c)) dz ≈ mean(log(p(z/c))) = log(p(z/c)) (see the sketch after this list).
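
Here is the promised sketch of point 3, using toy 1-D Gaussians for q and p so the exact cross term is available for comparison (assumed values, not repo code):

import math
import torch
from torch.distributions import Normal

torch.manual_seed(0)

# Estimate the cross term E_q[log p(z)] by sampling from q.
m_q, s_q = 0.0, 1.0
m_p, s_p = 1.0, 2.0
q, p = Normal(m_q, s_q), Normal(m_p, s_p)

# Exact value for Gaussians: -log(s_p) - 0.5*log(2*pi) - (s_q^2 + (m_q - m_p)^2) / (2*s_p^2)
exact = -math.log(s_p) - 0.5 * math.log(2 * math.pi) - (s_q**2 + (m_q - m_p)**2) / (2 * s_p**2)

one_sample = p.log_prob(q.sample()).item()                  # the "one z is enough" estimate
many_samples = p.log_prob(q.sample((100000,))).mean().item()

print(exact, one_sample, many_samples)
# The single-sample estimate is noisy but unbiased; averaged over training steps
# (and over time/channel positions), it converges to the exact expectation.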

@yanggeng1995

yanggeng1995 commented Sep 9, 2022

(Quoted @jaywalnut310's explanation and @BridgetteSong's supplementary points above.)

@BridgetteSong Thanks for your answer. One more question: why not just compute the negative log-likelihood of the Gaussian from z_p, m_p and logs_p? Wouldn't that be more convenient?

@BridgetteSong
Author

BridgetteSong commented Sep 9, 2022

@yanggeng1995 It is easy to compute ∫q(z/x) * log(q(z/x)) dz because q is Gaussian. And ∫q(z/x) * log(p(z/c)) dz is also easy to compute once you accept the sampling approximation: ∫q(z/x) * log(p(z/c)) dz ≈ log(p(z/c)).

p(z/c) is the product of a Gaussian and the Jacobian determinant. To compute log(p(z/c)), we first sample z from the posterior, get z_p = flow(z), and finally use z_p to evaluate the log-likelihood of the prior Gaussian N(m_p, logs_p).

So log(p(z/c)) = logdet(df/dz) + log(N(z_p|m_p, logs_p)) = 0 - logs_p - 0.5 * log(2*pi) - 0.5 * exp(-2 * logs_p) * (z_p - m_p) ** 2.

I think it would also be valid to use kl_loss ≈ log(q(z/x)) - log(p(z/c)) directly, with log(q(z/x)) = -logs_q - 0.5 * log(2*pi) - 0.5 * exp(-2 * logs_q) * (z - m_q) ** 2 where z ~ posterior(m_q, logs_q), and log(p(z/c)) computed as above. But I think the author's method is more concise and more accurate (see the comparison sketch below).
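
A toy comparison of the two estimators (a sketch on a plain Gaussian pair, no flow, so the exact KL is available for reference):

import torch
from torch.distributions import Normal, kl_divergence

torch.manual_seed(0)

q = Normal(torch.tensor(0.0), torch.tensor(1.0))
p = Normal(torch.tensor(1.0), torch.tensor(2.0))

z = q.sample((100000,))

full_mc = q.log_prob(z) - p.log_prob(z)        # log q(z) - log p(z), both terms sampled
half_closed = -q.entropy() - p.log_prob(z)     # closed-form negative entropy + sampled cross term

print("exact KL           :", kl_divergence(q, p).item())
print("fully sampled  mean:", full_mc.mean().item(), " std:", full_mc.std().item())
print("half closed    mean:", half_closed.mean().item(), " std:", half_closed.std().item())
# Both estimators are unbiased; the half-closed-form one typically has smaller variance,
# which matches the "more concise and more accurate" remark above.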

@980202006

(Quoted @candlewill's suggestion above to take the absolute value of the first term.)

Hi, does this work?

@BridgetteSong
Author

@980202006 It will not work. Normally the KL loss will not be negative if your inputs and network are correct. When kl_loss < 0, it means your prior distribution is almost the same as your posterior distribution, i.e. the posterior has failed to learn a sufficiently complex distribution.
So when kl_loss < 0, the first thing to do is check your inputs and network. If you must add a constraint to the loss formula, apply it to the whole expression rather than to the first term only, like this:

  • kl = logs_p - logs_q - 0.5
  • kl += 0.5 * ((z_p - m_p)**2) * torch.exp(-2. * logs_p)
  • kl = torch.clamp(kl, min=0.0)

But usually you do not need this constraint: if your KL loss goes negative, the network is not training successfully, and adding the constraint will not give you correct results either. (The sketch below shows why individual terms can dip below zero even though the true KL is non-negative.)
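
A small sketch of why per-sample terms can be negative while the true KL stays non-negative. Take p = q (true KL = 0, the "prior ≈ posterior" situation described above): each sample gives (logs_p - logs_q - 0.5) + 0.5 * exp(-2*logs_p) * (z - m_p)**2, which is negative whenever |z - m_p| < exp(logs_p), i.e. about 68% of the time, while the average over many samples stays near zero.

import torch

torch.manual_seed(0)

m, logs = torch.tensor(0.0), torch.tensor(0.0)
z = m + torch.randn(100000) * torch.exp(logs)   # z ~ q = N(m, exp(logs)^2), and here p = q

per_sample = (logs - logs - 0.5) + 0.5 * torch.exp(-2 * logs) * (z - m) ** 2
print("fraction of negative per-sample terms:", (per_sample < 0).float().mean().item())
print("mean (≈ true KL = 0):", per_sample.mean().item())

# Clamping caps individual terms at zero but biases the estimate upward;
# it does not fix the underlying training issue.
print("clamped mean:", torch.clamp(per_sample, min=0.0).mean().item())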

@fenling

fenling commented Sep 14, 2023

(Quoted @BridgetteSong's supplementary points above.)

@BridgetteSong Hi, I want to know why mean(log(p(z/c))) = log(p(z/c)). Why is sampling one z enough?

@Cheneng

Cheneng commented Dec 21, 2023

(Quoted @BridgetteSong's answer above about negative KL loss and clamping.)

A larger batch size may help :) I think some abnormal or extreme data points may be ruining your model; once I enlarged my batch size, the problem disappeared.
