Commit

Merge pull request #179 from reimandlab/fix-duplicates-hiv
Select the less significant call in case of duplicate HIV calls
krassowski committed Feb 8, 2022
2 parents cf7ee22 + ba7507f commit e47050b
Showing 1 changed file with 9 additions and 0 deletions.
9 changes: 9 additions & 0 deletions website/imports/sites/infections.py
@@ -441,6 +441,15 @@ def load_sites(self, file_path='data/sites/2016_Greenwood/elife-18296-fig6-data1
'Log2 (fold change) HIV WT vs Mock': 'effect_size'
}, inplace=True)

# some combinations of PTMs were called multiple times for a single peptide;
# as we have no way to select the most likely call, we conservatively choose
# the one which has the least significant results (to avoid inflating FDR)
sites = (
sites
.sort_values('adj_p_val', ascending=False)
.drop_duplicates(subset=['protein_accession', 'residue', 'position'], keep='first')
)

mapped_sites = self.process_event_associated_sites(
sites,
canonical=CANONICAL_PHOSPHOSITE_RESIDUES
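
The deduplication added in this commit can be tried in isolation on a toy table. The sketch below is an illustration only: the helper name `keep_least_significant` and the sample rows are made up for this example and are not part of the repository; only the column names and the `sort_values` / `drop_duplicates` chain are taken from the diff.

```python
import pandas as pd


def keep_least_significant(
    sites,
    key=('protein_accession', 'residue', 'position'),
    p_col='adj_p_val',
):
    """Hypothetical helper: for each duplicated site keep the row
    with the largest (least significant) adjusted p-value."""
    return (
        sites
        # largest adjusted p-value first, so that it is the row which survives
        .sort_values(p_col, ascending=False)
        .drop_duplicates(subset=list(key), keep='first')
    )


# made-up data: the site P1/S/10 was called twice with different p-values
toy = pd.DataFrame({
    'protein_accession': ['P1', 'P1', 'P2'],
    'residue': ['S', 'S', 'T'],
    'position': [10, 10, 5],
    'adj_p_val': [0.01, 0.20, 0.03],
})

deduplicated = keep_least_significant(toy)
print(deduplicated)
```

For the duplicated P1/S/10 site only the call with `adj_p_val` of 0.20 is retained. Note that `sort_values` places missing values last by default, so under this approach a duplicate with a missing `adj_p_val` would be dropped in favour of one with a value; whether that is the desired behaviour for the real data is not something the diff states.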