For example, with KCNK1:
from cancermuts.datasources import UniProt
# create the corresponding uniprot object
up = UniProt()
# request the sequence, explicitly specifying the UniProt ID
seq = up.get_sequence('KCNK1', upid='KCNK1_HUMAN')
from cancermuts.datasources import PhosphoSite, dbPTM, GlyGen, MobiDB, NetPhos
# add annotations from PhosphoSite
ps = PhosphoSite('/data/databases/phosphosite/')
ps.add_position_properties(seq)
print(seq.positions[4].properties)
print(seq.positions[28].properties)
# add annotations from dbPTM
db = dbPTM('/data/databases/dbPTM/')
db.add_position_properties(seq)
# add annotations from GlyGen
gg = GlyGen(
    '/data/databases/GlyGen/',
    database_file='human_proteoform_glycosylation_sites_uniprotkb_filtered.csv'
)
gg.add_position_properties(seq)
# add annotations from NetPhos
# named nph rather than np to avoid clashing with the usual numpy alias
nph = NetPhos('/data/databases/netphos_human_proteome/netphos_human_isoforms/raw/')
nph.add_position_properties(seq)
# build the output table
from cancermuts.table import Table
tbl = Table()
df = tbl.to_dataframe(seq)
For N95 of KCNK1 (UniProt O00180), we have entries in both dbPTM and GlyGen:
$ grep O00180 /data/databases/dbPTM/N-linked_Glycosylation
KCNK1_HUMAN O00180 95 N-linked Glycosylation 11053038;8978667 ASNYGVSVLSNASGNWNWDFT
$ grep O00180 /data/databases/GlyGen/human_proteoform_glycosylation_sites_uniprotkb_filtered.csv
O00180-1,95,Asn,,N-linked,protein_xref_uniprotkb_gly,O00180,protein_xref_uniprotkb_gly,O00180,N-linked (GlcNAc...) asparagine,PubMed,ECO_0000269,,GlcNAc...,,NAS,NXS,95,95,Asn,Asn,N
O00180-1,95,Asn,,N-linked,protein_xref_pubmed,8978667,protein_xref_uniprotkb_gly,O00180,N-linked (GlcNAc...) asparagine,PubMed,ECO_0000269,,GlcNAc...,,NAS,NXS,95,95,Asn,Asn,N
However, in standard output we get:
Wed May 27 10:37:42 2026 INFO added property <PositionProperty Glycosylation Site from dbPTM>
Wed May 27 10:37:42 2026 INFO property <PositionProperty Glycosylation Site from dbPTM> was replaced with <PositionProperty Glycosylation Site from GlyGen>
Wed May 27 10:37:42 2026 INFO property <PositionProperty Glycosylation Site from GlyGen> was replaced with <PositionProperty Glycosylation Site from GlyGen>
And in the final CSV file:
94,95,N,,,,,,,,,Gly,N-GlcNAc,,GlyGen,,,,,,,,,,,,,,,,,,,,,,,,
So we are at least missing the dbPTM source in the output.
N95 is the only glycosylation site for this protein.
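To make the difference between the observed and the expected behaviour concrete, here is a minimal standalone sketch. It does not use or reproduce cancermuts internals: `SiteProperty`, `add_replacing` and `add_merging` are hypothetical names introduced only for illustration. It replays the three records found for N95 (one from dbPTM, two from GlyGen) through a "replace" policy, which is what the log messages suggest is happening, and through a "merge" policy that keeps every distinct source.

```python
from dataclasses import dataclass, field


@dataclass
class SiteProperty:
    """Hypothetical stand-in for a position property; not a cancermuts class."""
    category: str
    sources: list = field(default_factory=list)


def add_replacing(props, category, source):
    """Behaviour suggested by the log: a new property of the same
    category overwrites the existing one, dropping its source."""
    props[category] = SiteProperty(category, [source])


def add_merging(props, category, source):
    """Alternative: keep one property per category and accumulate
    each distinct source exactly once."""
    prop = props.setdefault(category, SiteProperty(category))
    if source not in prop.sources:
        prop.sources.append(source)


# One dbPTM record and two GlyGen records for N95, as in the grep output above.
records = [
    ("Glycosylation Site", "dbPTM"),
    ("Glycosylation Site", "GlyGen"),
    ("Glycosylation Site", "GlyGen"),
]

replaced, merged = {}, {}
for category, source in records:
    add_replacing(replaced, category, source)
    add_merging(merged, category, source)

print(",".join(replaced["Glycosylation Site"].sources))  # GlyGen
print(",".join(merged["Glycosylation Site"].sources))    # dbPTM,GlyGen
```

Under the replace policy only GlyGen survives, which matches the CSV row above. The second GlyGen record replacing the first would also be consistent with the third log line. With the merge policy, the source column would list both dbPTM and GlyGen.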