The difference between paper accepted by ISBM2022 and code in this project #145

qiyueliuhuo23 · 2023-03-10T08:50:36Z

In the paper published in ISBM 2022: De novo mass spectrometry peptide sequencing with a transformer model, the input embedding is

But in the code in this project, you use the depthcharge package (specifically, the MassEncoder class in depthcharge/components/encoder.py) to compute the input embedding, and the specific implementation is shown below:

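# Excerpt: min_wavelength, max_wavelength, n_sin, and n_cos are defined outside this snippet.
if min_wavelength:
    base = min_wavelength / (2 * np.pi)
    scale = max_wavelength / min_wavelength
else:
    base = 1
    scale = max_wavelength / (2 * np.pi)

# One term per sine feature, spaced geometrically from base to base * scale.
sin_term = base * scale ** (
    torch.arange(0, n_sin).float() / (n_sin - 1)
)
# Same spacing for the cosine features.
cos_term = base * scale ** (
    torch.arange(0, n_cos).float() / (n_cos - 1)
)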
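To make the snippet easier to reason about, here is a minimal plain-Python sketch of the same arithmetic. It assumes the terms are later used as divisors, i.e. `sin(m / term)`, which is not shown in the excerpt. The function names `snippet_terms` and `wavelengths` and the values `n_sin = n_cos = 3`, `0.1`, and `10.0` are made up for illustration and are not part of depthcharge.

```python
import math


def snippet_terms(n_sin, n_cos, min_wavelength, max_wavelength):
    """Recompute sin_term and cos_term from the excerpt with plain floats.

    Hypothetical helper for illustration; requires n_sin > 1 and n_cos > 1,
    otherwise the exponent divides by zero.
    """
    if min_wavelength:
        base = min_wavelength / (2 * math.pi)
        scale = max_wavelength / min_wavelength
    else:
        base = 1
        scale = max_wavelength / (2 * math.pi)

    sin_term = [base * scale ** (k / (n_sin - 1)) for k in range(n_sin)]
    cos_term = [base * scale ** (k / (n_cos - 1)) for k in range(n_cos)]
    return sin_term, cos_term


def wavelengths(terms):
    """Assuming a feature is sin(m / term), its period in m is 2 * pi * term."""
    return [2 * math.pi * t for t in terms]


sin_term, cos_term = snippet_terms(3, 3, 0.1, 10.0)
sin_wl = wavelengths(sin_term)  # approximately 0.1, 1.0, 10.0
cos_wl = wavelengths(cos_term)  # approximately 0.1, 1.0, 10.0

# With min_wavelength falsy, the else branch is taken instead.
no_min_wl = wavelengths(snippet_terms(3, 3, 0, 10.0)[0])
```

Under that assumption, both the sine and the cosine wavelengths run geometrically from `min_wavelength` to `max_wavelength`. When `min_wavelength` is falsy, the shortest wavelength becomes 2π and the longest is still `max_wavelength`.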

According to the code, the embedding f can be formulated as below:

I am wondering why there is a difference between the paper and the code. Is it because the version in the code performs better? If you could tell me, I would be very grateful. Thanks!

melihyilmaz · 2023-03-10T22:20:14Z

Hi,

Thanks for bringing this to our attention. The formulation in the paper is correct, but the current Depthcharge implementation is incorrect. We will work on fixing the implementation and release a new model.

liangzhendong123 · 2023-03-21T03:08:43Z

About that: could it be that the mistake is in the paper, and the original version of depthcharge is correct? In my understanding, base = min_wavelength / (2 * np.pi) controls the minimum resolvable resolution in m/z, and scale = max_wavelength / min_wavelength plays a role similar to the 10000 chosen in the original Transformer. I am not sure I understand this correctly. If I have made a mistake, please point it out. Thank you.
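For comparison, the positional encoding in the original Transformer paper ("Attention Is All You Need") is, with $pos$ the position and $d_{\text{model}}$ the model dimension:

$$ PE_{(pos,\, 2i)} = \sin\left(pos / 10000^{2i/d_{\text{model}}}\right), \qquad PE_{(pos,\, 2i+1)} = \cos\left(pos / 10000^{2i/d_{\text{model}}}\right) $$

Assuming the terms in the snippet are used as divisors of the m/z value $m_{j}$ (that step is not shown in this thread), the $k$-th sine feature would read as follows, where $k$ is notation introduced here for the index inside `torch.arange(0, n_sin)`:

$$ \sin\left(m_{j} / \left(\frac{\lambda_{\text{min}}}{2\pi}\left(\frac{\lambda_{\text{max}}}{\lambda_{\text{min}}}\right)^{k/(n_{\sin} - 1)}\right)\right), \qquad k = 0, \dots, n_{\sin} - 1 $$

Under that reading, $\lambda_{\text{max}} / \lambda_{\text{min}}$ takes the place of the constant 10000, and the factor $\lambda_{\text{min}} / (2\pi)$ rescales the argument so that the shortest wavelength is $\lambda_{\text{min}}$ rather than $2\pi$.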

wfondrie · 2023-03-21T05:56:36Z

Hi @qiyueliuhuo23 and @liangzhendong123 - thank you so much for looking at our manuscript and code so closely! I've opened a PR fixing the implementation in depthcharge (see wfondrie/depthcharge#28), although the correct equation is a little different:

... I have updated the formula to the following, where $\lambda_{\text{max}}$ is the maximum wavelength, $\lambda_{\text{min}}$ is the minimum wavelength, $i$ is the zero-based index of the feature, and $d$ is one less than the number of features (the number of features is also referred to as the dimensionality of the model).

$$ f_{i} = \begin{cases} \sin\left(m_{j} / \left(\frac{\lambda_{\text{min}}}{2\pi}\left(\frac{\lambda_{\text{max}}}{\lambda_{\text{min}}}\right)^{2i/d}\right)\right), & \text{for } i \leq d/2 \\ \cos\left(m_{j} / \left(\frac{\lambda_{\text{min}}}{2\pi}\left(\frac{\lambda_{\text{max}}}{\lambda_{\text{min}}}\right)^{2i/d - 1}\right)\right), & \text{for } i > d/2 \end{cases} $$

As an example, if we use $\lambda_{\text{min}} = 0.1$, $\lambda_{\text{max}} = 10$, and $d = 4$, then the first sinusoidal feature, $i = 0$, will evaluate to a wavelength of 0.1:

$$ \sin\left(m_{j} / \left(\frac{0.1}{2\pi}\left(\frac{10}{0.1}\right)^{2 \cdot 0/4}\right)\right) = \sin(2\pi \cdot m_{j} / 0.1) $$

The last sinusoidal feature, $i = 4$, will evaluate to a wavelength of 10:

$$ \cos\left(m_{j} / \left(\frac{0.1}{2\pi}\left(\frac{10}{0.1}\right)^{2 \cdot 4/4 - 1}\right)\right) = \cos(2\pi \cdot m_{j} / 10) $$
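The worked example can be checked numerically. The sketch below evaluates the piecewise formula exactly as written in this comment, using only the Python standard library. The function names `feature_wavelength` and `encode` are hypothetical and chosen for illustration. This is not the depthcharge implementation from the PR.

```python
import math


def feature_wavelength(i, d, lambda_min, lambda_max):
    """Wavelength of feature i (zero-based, 0 <= i <= d) under the formula above."""
    # The exponent is 2i/d on the sine branch and 2i/d - 1 on the cosine branch.
    exponent = 2 * i / d if i <= d / 2 else 2 * i / d - 1
    divisor = lambda_min / (2 * math.pi) * (lambda_max / lambda_min) ** exponent
    # sin(m / divisor) and cos(m / divisor) both have period 2 * pi * divisor in m.
    return 2 * math.pi * divisor


def encode(m, d, lambda_min, lambda_max):
    """Return the d + 1 features for a single m/z value m."""
    features = []
    for i in range(d + 1):
        wavelength = feature_wavelength(i, d, lambda_min, lambda_max)
        fn = math.sin if i <= d / 2 else math.cos
        features.append(fn(2 * math.pi * m / wavelength))
    return features


# Values from the example: lambda_min = 0.1, lambda_max = 10, d = 4.
wls = [feature_wavelength(i, 4, 0.1, 10.0) for i in range(5)]
feats_at_zero = encode(0.0, 4, 0.1, 10.0)
```

With these values, `wls[0]` comes out at approximately 0.1 and `wls[4]` at approximately 10, matching the two cases in the example. The intermediate features follow from the same formula: the sine features ($i = 0, 1, 2$) get wavelengths of about 0.1, 1, and 10, and the cosine features ($i = 3, 4$) get about 1 and 10.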

liangzhendong123 · 2023-03-21T09:15:20Z

👀 So just to be sure: the original depthcharge is right, if we ignore the small difference in the cosine term?

qiyueliuhuo23 · 2023-03-21T09:58:06Z

> 👀 So just to be sure: the original depthcharge is right, if we ignore the small difference in the cosine term?

I agree with you, based on what @wfondrie has said.

qiyueliuhuo23 · 2023-03-21T10:01:36Z

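> Hi @liangzhendong123 - thank you so much for looking at our manuscript and code so closely! I've opened a PR fixing the implementation in depthcharge (see wfondrie/depthcharge#28), although the correct equation is a little different:
>
> ... I have updated the formula to the following, where $\lambda_{\text{max}}$ is the maximum wavelength, $\lambda_{\text{min}}$ is the minimum wavelength, $i$ is the index of the feature (zero-based), and $d$ is one less than the number of features---also referred to as the dimensionality of the model.
>
> $$ f_{i} = \begin{cases} \sin\left(m_{j} / \left(\frac{\lambda_{\text{min}}}{2\pi}\left(\frac{\lambda_{\text{max}}}{\lambda_{\text{min}}}\right)^{2i/d}\right)\right), & \text{for } i \leq d/2 \\ \cos\left(m_{j} / \left(\frac{\lambda_{\text{min}}}{2\pi}\left(\frac{\lambda_{\text{max}}}{\lambda_{\text{min}}}\right)^{2i/d - 1}\right)\right), & \text{for } i > d/2 \end{cases} $$
>
> As an example, if we use $\lambda_{\text{min}} = 0.1$, $\lambda_{\text{max}} = 10$, and $d = 4$, then the first sinusoidal feature, $i = 0$ will evaluate to a wavelength of 0.1:
>
> $$ \sin\left(m_{j} / \left(\frac{0.1}{2\pi}\left(\frac{10}{0.1}\right)^{2 \cdot 0/4}\right)\right) = \sin(2\pi \cdot m_{j} / 0.1) $$
>
> The last sinusoidal feature, $i = 4$ will evaluate to a wavelength of 10:
>
> $$ \cos\left(m_{j} / \left(\frac{0.1}{2\pi}\left(\frac{10}{0.1}\right)^{2 \cdot 4/4 - 1}\right)\right) = \cos(2\pi \cdot m_{j} / 10) $$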

It's embarrassing to say, but I found that mistake first, and you didn't mention that at all. It makes me a little unhappy. Still, it's good of you to fix the bug and reply to the question promptly. Thanks a lot.

bittremieux · 2023-03-21T13:36:32Z

Apologies @qiyueliuhuo23, this was simply confusion because both your (default) avatars look pretty similar. Thank you both for pointing out this issue and the discussion.

Please see the proposed fixes in wfondrie/depthcharge#28 on how we'll update both the formulation in the paper and the code in depthcharge. We're working on implementing this correctly in Casanovo.

wfondrie · 2023-03-21T15:15:51Z

Hi @qiyueliuhuo23 - sorry for the confusion. I've gone back and updated the comments and PR. 🙏

bittremieux · 2023-04-14T09:56:37Z

We have now updated the code to use the correct sinusoidal encoding and will release a new version of Casanovo and the corresponding weights after retraining soon.

bittremieux added the "question" label Mar 10, 2023

wfondrie mentioned this issue Mar 20, 2023

[bug] m/z encoding is incorrect wfondrie/depthcharge#27

Closed

wfondrie mentioned this issue Mar 21, 2023

Fix the sinusoidal encoders wfondrie/depthcharge#28

Merged

bittremieux closed this as completed Apr 14, 2023
