Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding different Modifications to a recurrent a.a in a sequence at different positions to get fragment ions. #14

Open
HijaziHassan opened this issue Jun 10, 2023 · 2 comments

Comments

@HijaziHassan
Copy link

HijaziHassan commented Jun 10, 2023

Hello ,

PSMatch::calculateFragments("KLATTQKLCPIR",
                            z= 1:2,
                            type=c("b", "y"),
                            modifications=c(K=16.023, Nterm = 16.023)
                            )

Using this dummy example, the assigned modification at K will be added to all the Ks in the sequence. However, I want it to only be added to K7 and not K1.
Is there an argument that forces a position-specific modification?

@sgibb
Copy link
Member

sgibb commented Jun 10, 2023

Unfortunately it is not possible.

If you don't care about neutral loss you could use the following (dirty) way:
(In general we don't recommend to overwrite internal functions because this could yield unpredictable behaviour.)

  1. It greps the amino acid data.frame.
  2. Adds a special "k" that is identical to the original "K".
  3. Overwrites the internal function to fetch amino acids.
  4. Replace the "K" on pos 7 with "k".
  5. Apply the modification to "k" instead of "K".
library("PSMatch")

## get original amino acids data.frame
aa <- PSMatch:::getAminoAcids()

## add special "k"
aa <- rbind(aa, aa[aa$AA == "K",])
aa$AA[nrow(aa)] <- "k"

## add function to overwrite internal one
getAminoAcids.k <- function() {
    aa
}

# original output
head(
PSMatch::calculateFragments("KLATTQKLCPIR",
                            z = 1:2,
                            type = c("b", "y"),
                            modifications = c(K = 16.023, Nterm = 16.023),
                            neutralLoss = NULL
                            )
)
#> Modifications used: K=16.023, Nterm=16.023
#>          mz ion type pos z seq
#> 1 161.14824  b1    b   1 1   K
#> 2  81.07776  b1    b   1 2   K
#> 3 274.23230  b2    b   2 1  KL
#> 4 137.61979  b2    b   2 2  KL
#> 5 345.26941  b3    b   3 1 KLA
#> 6 173.13834  b3    b   3 2 KLA

tail(
PSMatch::calculateFragments("KLATTQKLCPIR",
                            z = 1:2,
                            type = c("b", "y"),
                            modifications = c(K = 16.023, Nterm = 16.023),
                            neutralLoss = NULL
                            )
)
#> Modifications used: K=16.023, Nterm=16.023
#>           mz ion type pos z         seq
#> 39 1075.6209  y9    y   9 1   TTQKLCPIR
#> 40  538.3141  y9    y   9 2   TTQKLCPIR
#> 41 1146.6580 y10    y  10 1  ATTQKLCPIR
#> 42  573.8327 y10    y  10 2  ATTQKLCPIR
#> 43 1259.7421 y11    y  11 1 LATTQKLCPIR
#> 44  630.3747 y11    y  11 2 LATTQKLCPIR

## overwrite internal function
assignInNamespace("getAminoAcids", value = getAminoAcids.k, ns = "PSMatch")

## changed output
head(
PSMatch::calculateFragments("KLATTQkLCPIR",
                            z = 1:2,
                            type = c("b", "y"),
                            modifications = c(k = 16.023, Nterm = 16.023),
                            neutralLoss = NULL
                            )
)
#> Modifications used: k=16.023, Nterm=16.023
#>          mz ion type pos z seq
#> 1 145.12524  b1    b   1 1   K
#> 2  73.06626  b1    b   1 2   K
#> 3 258.20930  b2    b   2 1  KL
#> 4 129.60829  b2    b   2 2  KL
#> 5 329.24641  b3    b   3 1 KLA
#> 6 165.12684  b3    b   3 2 KLA

tail(
PSMatch::calculateFragments("KLATTQkLCPIR",
                            z = 1:2,
                            type = c("b", "y"),
                            modifications = c(k = 16.023, Nterm = 16.023),
                            neutralLoss = NULL
                            )
)
#> Modifications used: k=16.023, Nterm=16.023
#>           mz ion type pos z         seq
#> 39 1075.6209  y9    y   9 1   TTQkLCPIR
#> 40  538.3141  y9    y   9 2   TTQkLCPIR
#> 41 1146.6580 y10    y  10 1  ATTQkLCPIR
#> 42  573.8327 y10    y  10 2  ATTQkLCPIR
#> 43 1259.7421 y11    y  11 1 LATTQkLCPIR
#> 44  630.3747 y11    y  11 2 LATTQkLCPIR

Created on 2023-06-10 with reprex v2.0.2

The rules for neutral loss calculation are hard coded. If you really need this you could overwrite PSMatch:::.neutralLoss in a similar manner.

If you have a clever solution for an interface for modifications on a specific position (meaning a consists naming of arguments etc.) you are welcome to share your thoughts. This feature was planned but because of a good interface we never implemented it.

@HijaziHassan
Copy link
Author

HijaziHassan commented Jun 10, 2023

Thank you Sebastain for suggesting this workaround.

Not that I know off.
But I think the best would be similar to what is done in spectrum_utils in python; they use the ProForma notation to annotate spectra. Which must include an underlying fragment ions generation based on the modifications added in the sequence itself (either by name: EM[Oxidation]EVEES[Phospho]PEK or by CV accession: EM[MOD:00719]EVEES[MOD:00046]PEK or by by their delta mass: EM[+15.9949]EVEES[+79.9663]PEK.

From my side,
I tried many ways before I wrote this post since I expected that this should be already baked in, like:

adding multiple Ks

modifications = c(K = 10, K=20)

expecting that the first argument will be matched to the first K, and the second modification to the second K.
or maybe adding numbers after the a.a. like: modifications = c(K1 .. and K7 = ...)

I also tried to give each modification a numeric vector, so here you will have a list of list:

modifications = c(K= c(10, 20))

for sure neither of them worked and are not neat. But still no error was flagged!

I will see if I can manage to write something considering the ProForma style.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants