<a href="https://colab.research.google.com/github/vprobon/iLIR-ML-data/blob/main/iLIR_ML_v0_9.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# iLIR-ML-v0.9: A Machine Learning Method for Prediction of functional LIR motifs from protein sequences and AlphaFold2-based predicted Intrinsic Disorder

Currently accepted input is a UniProt identifier.

**Note:** To run a prediction you may either
> - run in sequence all the cells below and enter the UniProt ID of interest in the text area in the last cell, or
> - select ```Runtime->Run all``` and go directly to the last cell to enter the UniProt ID.

You may use the following link to navigate directly to the [input form](https://colab.research.google.com/drive/1yWIE_s6r07OoOuIa1fXt8_MZeC89S0QE#scrollTo=NjeyHYF6b1s8&line=1&uniqifier=1).

# Setup

Start by running the cells below to initialize the environment and define utility functions.

In [1]:
# Clear working directory
!rm -rf sample_data
!rm -rf iLIR-ML*
!rm -f *.pkl


In [2]:
# Copy necessary files from GitHub
!wget 'https://github.com/vprobon/iLIR-ML-data/raw/main/iLIR-ML(AF2_newPSSM_pLIRm)-20240617T031502Z-001.zip'
!unzip 'iLIR-ML(AF2_newPSSM_pLIRm)-20240617T031502Z-001.zip'
!rm 'iLIR-ML(AF2_newPSSM_pLIRm)-20240617T031502Z-001.zip'
!ln -s  '/content/iLIR-ML(AF2_newPSSM_pLIRm)' workdir

--2024-06-17 15:00:07--  https://github.com/vprobon/iLIR-ML-data/raw/main/iLIR-ML(AF2_newPSSM_pLIRm)-20240617T031502Z-001.zip
Resolving github.com (github.com)... 140.82.112.4
Connecting to github.com (github.com)|140.82.112.4|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://raw.githubusercontent.com/vprobon/iLIR-ML-data/main/iLIR-ML(AF2_newPSSM_pLIRm)-20240617T031502Z-001.zip [following]
--2024-06-17 15:00:07--  https://raw.githubusercontent.com/vprobon/iLIR-ML-data/main/iLIR-ML(AF2_newPSSM_pLIRm)-20240617T031502Z-001.zip
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.111.133, 185.199.110.133, 185.199.109.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.111.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2781566 (2.7M) [application/zip]
Saving to: ‘iLIR-ML(AF2_newPSSM_pLIRm)-20240617T031502Z-001.zip’


2024-06-17 15:00:07 (34.9 MB/s) - ‘iLIR-ML(AF2_newPSS

In [3]:
# Utility functions

import requests
import json
import csv
# Fetch AlphaFold-disorder predictions from MobiDB
def query_mobidb(uniprot_id):
    url = f"https://mobidb.bio.unipd.it/api/download?acc={uniprot_id}&format=json"
    response = requests.get(url)
    if response.status_code == 200:
        return response.json()
    else:
        print(response.status_code)
        return None

# Function to extract prediction-disorder-alphafold values from JSON
def extract_alphafold_disorder(json_data):
    if 'prediction-disorder-alphafold' in json_data:
        return json_data['prediction-disorder-alphafold']['scores']
    else:
        return None

# Function to extract sequence from JSON
def extract_sequence(json_data):
    if 'sequence' in json_data:
        return json_data['sequence']
    else:
        return None

In [4]:
# PSSM related functions

import pandas as pd


# Load LIRcentral PSSMs from disk files
c= [(9,12),(12,15),(15,18),(18,21),(21,24),(24,27),(27,30),(30,33),(33,36),(36,39),(39,42),
            (42,45),(45,48),(48,51),(51,54),(54,57),(57,60),(60,63),(63,66), (66,69)
            ]

pssm_pos = pd.read_fwf('./workdir/positive.pssm', colspecs=c, skiprows=2, skipfooter=4)
pssm_neg = pd.read_fwf('./workdir/negative.pssm', colspecs=c, skiprows=2, skipfooter=4)
##

# split() without arguments avoids empty strings caused by the double spaces
AAlist = "A  R  N  D  C  Q  E  G  H  I  L  K  M  F  P  S  T  W  Y  V".split()

# For LIRcentral PSSMs
def pssm_score(seq,pssm):
  '''
  This function can score sequences (partially) against a PSSM.
  Partial scoring can be achieved by padding sequence appropriately
  with characters not in the amino acid alphabet
  '''

  score=0
  for i in range(0,len(seq)):
    # Silently skip unknown characters (with zero score)
    if seq[i] in AAlist:
      score += pssm[seq[i]][i]
  return(score)

# Legacy PSSM (Kalvari et al., 2014)
# Takes same sequence used to calculate pLIRm scores
pssm_kalvari = [{'X':0,'U':0,'A':0,'R':1,'N':-1,'D':3,'C':-3,'Q':-1,'E':1,'G':0,'H':-2,'I':-2,'L':-1,'K':0,'M':-2,'F':1,'P':0,'S':1,'T':-1,'W':-3,'Y':-2,'V':-2},
{'X':0,'U':0,'A':-2,'R':-2,'N':0,'D':5,'C':-3,'Q':-1,'E':2,'G':1,'H':-2,'I':-3,'L':-3,'K':-1,'M':1,'F':-4,'P':-2,'S':1,'T':1,'W':-4,'Y':-3,'V':-3},
{'X':0,'U':0,'A':-4,'R':-4,'N':-5,'D':-6,'C':-4,'Q':-4,'E':-5,'G':5,'H':-3,'I':-4,'L':-3,'K':-5,'M':-3,'F':4,'P':-5,'S':-4,'T':-4,'W':11,'Y':5,'V':-4},
{'X':0,'U':0,'A':-2,'R':-2,'N':-2,'D':2,'C':-3,'Q':1,'E':2,'G':-3,'H':-2,'I':1,'L':1,'K':-1,'M':-1,'F':-2,'P':-3,'S':-1,'T':1,'W':-3,'Y':-2,'V':2},
{'X':0,'U':0,'A':0,'R':-1,'N':-2,'D':0,'C':-2,'Q':-1,'E':1,'G':-2,'H':0,'I':1,'L':0,'K':0,'M':2,'F':1,'P':0,'S':0,'T':1,'W':-3,'Y':-1,'V':1},
{'X':0,'U':0,'A':-2,'R':-4,'N':-4,'D':-5,'C':-2,'Q':-4,'E':-4,'G':-5,'H':-4,'I':4,'L':4,'K':-4,'M':1,'F':-1,'P':-4,'S':-3,'T':-2,'W':-3,'Y':-2,'V':3}]

def pssm_calculator(seq):
    """Calculates PSSM scores of AIM sequences"""
    PSSM_score=0
    for i in range(5, 11):
        PSSM_score += pssm_kalvari[i-5][seq[i]]
    return PSSM_score
## end legacy PSSM


# Sequence padding functions
def left_pad_sequence(seq, total_length=10, pad_char='X'):
    return seq.rjust(total_length, pad_char)

def right_pad_sequence(seq, total_length=10, pad_char='X'):
    return seq.ljust(total_length, pad_char)

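The partial scoring described in the ```pssm_score``` docstring can be illustrated with a toy matrix. This is a minimal sketch, not the notebook's implementation: ```score_against_pssm``` and ```toy_pssm``` are hypothetical names, the matrix values are made up, and a plain dictionary stands in for the pandas DataFrame loaded from disk.

```python
AMINO_ACIDS = set("ARNDCQEGHILKMFPSTWYV")

def score_against_pssm(seq, pssm):
    """Sum pssm[residue][position]; characters outside the alphabet score 0."""
    return sum(pssm[aa][i] for i, aa in enumerate(seq) if aa in AMINO_ACIDS)

# Toy 4-position matrix with made-up values: residue -> per-position scores
toy_pssm = {aa: [0, 0, 0, 0] for aa in AMINO_ACIDS}
toy_pssm["W"] = [5, 0, 0, 0]
toy_pssm["L"] = [0, 0, 0, 4]
toy_pssm["A"] = [1, 1, 1, 1]

full = score_against_pssm("WAAL", toy_pssm)     # all four positions contribute
partial = score_against_pssm("XXAL", toy_pssm)  # 'X' padding masks positions 0 and 1
print(full, partial)  # 11 5
```

Padding with a character outside the amino acid alphabet therefore restricts the score to the unmasked positions, which is how the core, upstream and downstream regions can be scored separately against one matrix.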

# Import necessary libraries and models

# Code computing the legacy PSSM score (Kalvari et al., 2014) and the pLIRm score (Han et al., 2021).

Run all cells below.

In [5]:
# Code adapted from pLIRm
# https://github.com/BioCUCKOO/pLIRm-pLAM

BLOSUM62_file = open("./workdir/BLOSUM62R.txt",'r')
BLOSUM62_lines = BLOSUM62_file.readlines()
BLOSUM62_dic = {}
name_list = BLOSUM62_lines[0].split()
value_list = BLOSUM62_lines[1].split()
for i in range(1,len(name_list)):
    BLOSUM62_dic[name_list[i]] = value_list[i]
BLOSUM62_file.close()

In [6]:
def encode_pep_single(pep,p_list, weight_array):
    if weight_array is None:
        weight_array = []
        for i in range(len(p_list[0])):
            weight_array.append(1)
    data = []

    standard_dic = {'A':0,'R':1,'N':2,'D':3,'C':4,'Q':5,'E':6,'G':7,'H':8,'I':9,'L':10,'K':11,'M':12,'F':13,'P':14,'S':15,'T':16,'W':17,'Y':18,'V':19,'B':20,'Z':21,'X':22,'U':23}
    conposition_list = []
    for i in range(18):
        conposition_list.append({'A':0,'R':0,'N':0,'D':0,'C':0,'Q':0,'E':0,'G':0,'H':0,'I':0,'L':0,'K':0,'M':0,'F':0,'P':0,'S':0,'T':0,'W':0,'Y':0,'V':0,'B':0,'Z':0,'X':0,'U':0})
    tot_num = 0
    for pos_seq in p_list:
        if len(pos_seq) != 18:
            continue
        else:
            tot_num += 1
            for i in range(len(pos_seq)):
                if pos_seq[i].upper() not in standard_dic.keys():
                    AA = 'U'
                else:
                    # Use the upper-cased residue so lower-case input cannot raise a KeyError
                    AA = pos_seq[i].upper()
                conposition_list[i][AA] += 1
    temp_score = [[0 for col in range(24)] for length in range(24)]
    C_tot = [[0 for col in range(24)] for length in range(24)]
    for j in range(0, len(pep)):
        AA = pep[j]
        for key in conposition_list[j].keys():

            value = conposition_list[j][key]
            C_tot[standard_dic[AA]][standard_dic[key]]+= value

            key1 = AA+key
            key2 = key+AA
            if key1 in BLOSUM62_dic.keys():
                temp_score[standard_dic[AA]][standard_dic[key]] += value*int(BLOSUM62_dic[key1])*weight_array[j]
            elif key2 in BLOSUM62_dic.keys():
                temp_score[standard_dic[AA]][standard_dic[key]] += value*int(BLOSUM62_dic[key2])*weight_array[j]

    temp_code = []
    for row in range(24):
        for col in range(row,24):
            if row == col:
                if(temp_score[row][col] != 0):
                    temp_code.append(temp_score[row][col])
                else:
                    temp_code.append(0)
            else:
                if(temp_score[row][col]+temp_score[col][row] != 0):
                    temp_code.append((temp_score[row][col]+temp_score[col][row]))
                else:
                    temp_code.append(0)

    return(temp_code)


def load_data():
    p_file = open('./workdir/data_set.txt', 'r')
    posi = []
    for line in p_file.readlines():
        posi.append(line.strip())
    p_file.close()
    return posi

def load_trained(file):
    weight = []
    matrix = []
    matrix.append([])
    model = open(file, 'r')
    line = model.readline()
    for value in line.split("\t"):
        weight.append(float(value))
    for values in model.readlines():
        values = values.strip()
        if not values.startswith('~'):
            row_matrix = []
            for value in values.split("\t"):
                row_matrix.append(float(value))
            matrix[0].append(row_matrix)
        else:
            matrix.append(float(''.join(list(values)[1:])))
    matrix_array = []
    for i in range(len(matrix[0])):
        for j in range(i,len(matrix[0][i])):
            matrix_array.append(matrix[0][i][j])
    matrix_array.append(matrix[1])
    return [weight, matrix_array]

data_set = load_data()
trained_model = load_trained(r'./workdir/trained_model.txt')


def predict_s(pep):
    p_list = data_set
    weight_array = trained_model[0]
    matrix_array = trained_model[1]
    pep_encode = encode_pep_single(pep,p_list,weight_array)
    sim_score=0.0
    for j in range(len(pep_encode)):
        sim_score += pep_encode[j]*matrix_array[j]
    sim_score += matrix_array[-1]
    return (sim_score)


In [7]:
# LIRcentral-group code here
def calculate_pLIRm_score(seq):
  '''
  Sequence should be 7+4+7 residues long (upstream+core+downstream)
  When upstream/downstream sequences are shorter they should be padded using
  left_pad_sequence(seq, total_length=7, pad_char='X')
  or
  right_pad_sequence(seq, total_length=7, pad_char='X')
  respectively before calling calculate_pLIRm_score
  '''
  pLIRm_score = predict_s(seq)
  return(pLIRm_score)


# Predictor imports and functions

Run the cells below and proceed to the next section for performing a prediction.

In [8]:
# Import necessary libraries
import pandas as pd
import numpy as np
import joblib
from sklearn.preprocessing import StandardScaler
import ipywidgets as widgets
from IPython.display import display
from statistics import mean

# Load the pre-trained model and scaler
stacking_classifier_w_plirm = joblib.load('./workdir/stacking_classifier_w_pLIRm.pkl')
scaler_w_plirm = joblib.load('./workdir/scaler_w_pLIRm.pkl')

stacking_classifier_AFdis_pssm_newPSSMs = joblib.load('./workdir/stacking_classifier_AFdis_pssm_newPSSMs.pkl')
stacking_scaler_AFdis_pssm_newPSSMs = joblib.load('./workdir/stacking_scaler_AFdis_pssm_newPSSMs.pkl')

In [9]:
# Test the predictor
def test_predictor():

  df = pd.DataFrame()
  df['(-2)LIR PSSM score'] = [-1]
  df['AF2-disorder-avgscores-core']=[0.7]
  df['AF2-disorder-avgscores-upstream']=[0.8]
  df['AF2-disorder-avgscores-downstream']=[0.6]
  df['PSSM_LIRcentral-core']=[15]
  df['PSSM_LIRcentral-upstream']=[2]
  df['PSSM_LIRcentral-downstream']=[-1]
  df['nPSSM_LIRcentral-core']=[1]
  df['nPSSM_LIRcentral-upstream']=[-3]
  df['nPSSM_LIRcentral-downstream']=[-1]

  features = stacking_scaler_AFdis_pssm_newPSSMs.transform(df)
  classes = {1:"Functional", 0:"Non-functional"}
  prediction = stacking_classifier_AFdis_pssm_newPSSMs.predict(features)
  predicted_class = prediction[0]
  print(f"Predicted Class: {classes[predicted_class]}")

  df['pLIRm-score']=[1]
  features=scaler_w_plirm.transform(df)
  prediction=stacking_classifier_w_plirm.predict(features)
  predicted_class = prediction[0]
  print(f"Predicted Class: {classes[predicted_class]}")

# For debugging purposes only
debug = 0
if debug:
  test_predictor()


In [12]:
def predict_protein(change):
    with output:
        output.clear_output()
        # Get the UniProt identifier entered in the form
        uniprot_id = sequence_input.value.strip()
        if not uniprot_id:
            print("Please enter a UniProt Identifier.")
            return

        # Retrieve sequence and AlphaFold-disorder scores; stop on any failure
        try:
            json_data = query_mobidb(uniprot_id)
            if not json_data:
                print(f"Failed to retrieve results for UniProt ID: {uniprot_id}")
                return
            disorder_values = extract_alphafold_disorder(json_data)
            if not disorder_values:
                print(f"No AlphaFold disorder values for UniProt ID: {uniprot_id}")
                return
            sequence = extract_sequence(json_data)
            if not sequence:
                print(f"Failed to retrieve sequence for UniProt ID: {uniprot_id}")
                return
            if len(sequence) != len(disorder_values):
                print('Disorder scores do not match the given sequence')
                return
        except Exception:
            print(f"Invalid UniProt ID: {uniprot_id}? Cannot obtain AlphaFold-disorder data.")
            return

        # If we have reached this point then
        # sequence and disorder_values were
        # retrieved and read successfully

        # Now find all putative LIR motifs ([WFY]xx[VLI])
        # with overlaps allowed
        LIR_start = []
        # Stop 3 residues before the end so that sequence[i+3] always exists
        for i in range(0,len(sequence)-3):
          if sequence[i] in 'WFY' and sequence[i+3] in 'VLI': # We have a match to a canonical LIR motif
            LIR_start.append(i)

        # Prepare all features for each motif
        for start_pos in LIR_start:
          end_pos=start_pos+3
          df = pd.DataFrame()

          core = sequence[start_pos:end_pos+1] # Core motif sequence
          # Sequence required by pLIRm and legacy PSSM
          window_len=7
          # Slices are clipped at the sequence ends and never include core residues
          ups7 = sequence[max(0,start_pos-window_len):start_pos] # Upstream-7
          dns7 = sequence[end_pos+1:end_pos+1+window_len] # Downstream-7
          up_seq = left_pad_sequence(ups7,total_length=7)
          down_seq = right_pad_sequence(dns7,total_length=7)
          seq_pLIRm = up_seq + core + down_seq
          df['(-2)LIR PSSM score'] = [pssm_calculator(seq_pLIRm)]
          # pLIRm score: the last feature expected by stacking_classifier_w_plirm (used below)
          df['pLIRm-score'] = calculate_pLIRm_score(seq_pLIRm)

          # Sequence required by new PSSM
          window_len=10
          # Slices are clipped at the sequence ends and never include core residues
          ups10 = sequence[max(0,start_pos-window_len):start_pos] # Upstream-10
          dns10 = sequence[end_pos+1:end_pos+1+window_len] # Downstream-10

          core = left_pad_sequence(core,total_length=14)
          core = right_pad_sequence(core,total_length=24)
          df['PSSM_LIRcentral-core']= pssm_score(core, pssm_pos)
          df['nPSSM_LIRcentral-core']= pssm_score(core, pssm_neg)
          up_seq = left_pad_sequence(ups10,total_length=10)
          up_seq = right_pad_sequence(up_seq,total_length=24)
          df['PSSM_LIRcentral-upstream']=pssm_score(up_seq,pssm_pos)
          df['nPSSM_LIRcentral-upstream']=pssm_score(up_seq,pssm_neg)
          down_seq = right_pad_sequence(dns10,total_length=10)
          down_seq = left_pad_sequence(down_seq,total_length=24)
          df['PSSM_LIRcentral-downstream']=pssm_score(down_seq,pssm_pos)
          df['nPSSM_LIRcentral-downstream']=pssm_score(down_seq,pssm_neg)

          # For calculating AFdisorder parameters
          window_len=20
          values_list_core = disorder_values[start_pos:end_pos+1]
          df['AF2-disorder-avgscores-core'] = mean(values_list_core)
          start_pos_up = max(0,start_pos-window_len)
          end_pos_up = max(0,start_pos-1)
          values_list_up = disorder_values[start_pos_up:end_pos_up+1]
          df['AF2-disorder-avgscores-upstream'] = mean(values_list_up)
          start_pos_down = min(end_pos+1, len(sequence)-1)
          end_pos_down = min(end_pos+window_len, len(sequence)-1)
          values_list_down = disorder_values[start_pos_down:end_pos_down+1]
          df['AF2-disorder-avgscores-downstream'] = mean(values_list_down)

          ordered_features =[
              '(-2)LIR PSSM score', 'AF2-disorder-avgscores-core',
              'AF2-disorder-avgscores-upstream', 'AF2-disorder-avgscores-downstream',
              'PSSM_LIRcentral-core','PSSM_LIRcentral-upstream', 'PSSM_LIRcentral-downstream',
              'nPSSM_LIRcentral-core', 'nPSSM_LIRcentral-upstream', 'nPSSM_LIRcentral-downstream','pLIRm-score'
              ]
          df_ord=df[ordered_features]
          classes = {1:"Functional", 0:"Non-functional"}
          features=scaler_w_plirm.transform(df_ord)
          prediction=stacking_classifier_w_plirm.predict(features)
          predicted_class = prediction[0]
          probabilities =stacking_classifier_w_plirm.predict_proba(features)
          class_probabilities = probabilities[0]
          # features = stacking_scaler_AFdis_pssm_newPSSMs.transform(df_ord)
          # prediction = stacking_classifier_AFdis_pssm_newPSSMs.predict(features)
          # predicted_class = prediction[0]
          # probabilities = stacking_classifier_AFdis_pssm_newPSSMs.predict_proba(features)
          # predicted_class2 = np.argmax(probabilities, axis=1)[0]
          # class_probabilities = probabilities[0]  # Probabilities for the first (and only) sample

          print(f'{uniprot_id} {start_pos}', end='\t')
          print(f'{sequence[start_pos:start_pos+4]}', end='\t')
          print("PSSM:", df['(-2)LIR PSSM score'][0], end='\t')
          print(f"pLIRm {calculate_pLIRm_score(seq_pLIRm)}",end='\t')
          print(f"Pred: {classes[predicted_class]}", end="\t")
          #print(f"Prediction2: {predicted_class2}", end="\t")
          print(f"Prob: {class_probabilities}")


# Enter data for prediction

In [13]:
# Create a form to enter a protein sequence
# Currently, a valid UniProt identifier is expected
sequence_input = widgets.Textarea(value='',placeholder='Enter UniProt ID here',description='UniProt ID:',disabled=False)
display(sequence_input) # Display the form
button = widgets.Button(description="Predict") # Create a button to trigger prediction
output = widgets.Output()
button.on_click(predict_protein) # Attach the prediction function to the button click event
display(button, output) # Display the button and output area


Textarea(value='', description='UniProt ID:', placeholder='Enter UniProt ID here')

Button(description='Predict', style=ButtonStyle())

Output()

# Acknowledgements

This work has been possible through a grant awarded to the [Bioinformatics Research Laboratory](https://vprobon.github.io/BRL-UCY) at the [University of Cyprus](https://www.ucy.ac.cy) for the [LIRcentral project](https://lircentral.eu/).

LIRcentral is co-funded by the European Union (European Regional Development Fund, ERDF) and the Republic of Cyprus through the project EXCELLENCE/0421/0576 under the EXCELLENCE HUBS programme of the [Cyprus Research and Innovation Foundation](https://research.org.cy).

![picture](https://lircentral.eu/images/LIRcentral-FundedBy.png)


For the development of iLIR-ML-v0.9 a number of publicly available resources were/are used.

- Machine learning modules are based on the excellent [scikit-learn](https://scikit-learn.org/) Python toolkit.

- For the creation of features representing candidate LIR motifs for prediction, the following tools/resources are instrumental:

> - The [MobiDB database](https://mobidb.bio.unipd.it/) (Piovesan et al., 2020) provides precomputed intrinsic disorder predictions based on the AlphaFold-disorder method (Piovesan et al., 2022) for select UniProt entries.
> - The pLIRm software (freely available online at [GitHub](https://github.com/BioCUCKOO/pLIRm-pLAM)), which we have tailored to our pipeline for computing the pLIRm score as an additional predictive feature for LIR motifs. We are indebted to the authors for sharing their work.
> - The 'legacy' PSI-BLAST-derived PSSMs from previous work in our lab (Kalvari et al., 2014), ported to Python by undergraduate student Dimitris Kalanides.
> - Newly derived PSSMs (LIRcentral-PSSMs), based on a more recently updated version of the LIRcentral database (Chatzichristofi et al., 2023).


Last, but not least, a huge amount of work has been carried out by official and unofficial members of the LIRcentral team, who developed tools for assisting LIRcentral biocuration, curated LIRcentral entries from the published literature, and explored properties of the LIRcentral data. In addition, we are grateful to several experts in autophagy who have provided feedback on existing LIRcentral entries and suggestions for adding new instances of LIR motifs to the database. We intend to keep LIRcentral, its data, and the software tools derived from analysing these data freely available to the research community. We hope this work inspires and helps others to work on this or similar problems.




# Additional information



## Notes
The iLIR-ML method takes as input a UniProt identifier. It then retrieves the respective protein sequence and associated AlphaFold-disorder prediction scores from the MobiDB database (if available).  
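The consistency checks applied to a retrieved record can be sketched as follows. The field names (```sequence```, ```prediction-disorder-alphafold```, ```scores```) are those read by the notebook's utility functions. The helper ```parse_mobidb_record``` and the toy record are hypothetical and only illustrate the expected shape; they are not real MobiDB output.

```python
def parse_mobidb_record(record):
    """Return (sequence, disorder_scores) or raise ValueError if the record is unusable."""
    sequence = record.get("sequence")
    scores = record.get("prediction-disorder-alphafold", {}).get("scores")
    if not sequence:
        raise ValueError("record has no sequence")
    if not scores:
        raise ValueError("record has no AlphaFold-disorder scores")
    if len(sequence) != len(scores):
        raise ValueError("sequence and disorder scores differ in length")
    return sequence, scores

# Made-up record for illustration only
toy_record = {
    "sequence": "MSWFYL",
    "prediction-disorder-alphafold": {"scores": [0.9, 0.8, 0.4, 0.3, 0.3, 0.5]},
}
sequence, scores = parse_mobidb_record(toy_record)
print(len(sequence), len(scores))  # 6 6
```

Raising an exception for each failure mode keeps the caller from continuing with a missing sequence or mismatched scores.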

**Warning:** Not all UniProt entries have available AlphaFold-disorder data, so (unfortunately) iLIR-ML predictions for these entries will fail. We are currently working on fixing this issue, which mainly has to do with the compute limitations for running AlphaFold2 at scale.

Sequences suitable for prediction are then scanned for all occurrences of candidate LIR motifs, based on the degenerate regular expression pattern ```[WFY]xx[VLI]```. Overlaps are allowed.
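One way to sketch this scan is with a zero-width lookahead, so that overlapping candidates are all reported. The function name ```find_candidate_lirs``` is hypothetical, and the notebook itself uses an explicit loop rather than the ```re``` module.

```python
import re

# The lookahead consumes no characters, so overlapping matches are all found
CANDIDATE_LIR = re.compile(r"(?=([WFY]..[VLI]))")

def find_candidate_lirs(sequence):
    """Return (0-based start, 4-residue core) for every [WFY]xx[VLI] match."""
    return [(m.start(), m.group(1)) for m in CANDIDATE_LIR.finditer(sequence)]

print(find_candidate_lirs("MSWFYLVIA"))
# [(2, 'WFYL'), (3, 'FYLV'), (4, 'YLVI')]
```

In this toy sequence three candidates overlap; a plain ```re.finditer``` on ```[WFY]..[VLI]``` without the lookahead would report only the first.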

Based on the protein sequence, a number of features are calculated representing characteristic sequence properties in the vicinity of LIR motifs (e.g., pLIRm score, PSSMs), which are then used for predicting whether a candidate motif is functional (i.e., binds Atg8/LC3/GABARAP) or not.
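As an illustration of how the sequence context around a candidate motif can be extracted, the sketch below builds the 7+4+7 window described in the ```calculate_pLIRm_score``` docstring, padding with ```X``` where the protein ends. The helper ```flanking_windows``` is a hypothetical name and is not the notebook's own code.

```python
def flanking_windows(sequence, start, flank=7, pad_char="X"):
    """Return (upstream, core, downstream) for a 4-residue core at 0-based `start`."""
    end = start + 4
    # Slices are clipped at the sequence ends, then padded to a fixed length
    upstream = sequence[max(0, start - flank):start].rjust(flank, pad_char)
    core = sequence[start:end]
    downstream = sequence[end:end + flank].ljust(flank, pad_char)
    return upstream, core, downstream

up, core, down = flanking_windows("MSWFYLVIA", start=2)
print(up + core + down)  # XXXXXMSWFYLVIAXXXX
```

The same slicing with a different ```flank``` value gives the shorter or longer windows used by the other features.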



## Limitations

In addition to the technical limitations mentioned above, the major scientific limitation of iLIR-ML lies in its inherent inability to identify non-canonical LIR motifs. Previous work has suggested that such cases can only be revealed via structural/complex predictions (Ibrahim et al., 2023; Kołodziej et al., 2023; Zeke et al., 2024).

## Contact

For any scientific or technical inquiries, feel free to contact us via email (promponas.vasileios [at] ucy.ac.cy).


## References
- Chatzichristofi A, Sagris V, Pallaris A, Eftychiou M, Kalvari I, Price N, Theodosiou T, Iliopoulos I, Nezis IP, Promponas VJ. LIRcentral: a manually curated online database of experimentally validated functional LIR motifs. Autophagy. 2023 Dec;19(12):3189-3200. doi: 10.1080/15548627.2023.2235851. Epub 2023 Aug 2. PMID: 37530436; PMCID: PMC10621281.

- Han Z, Zhang W, Ning W, Wang C, Deng W, Li Z, Shang Z, Shen X, Liu X, Baba O, Morita T, Chen L, Xue Y, Jia D. Model-based analysis uncovers mutations altering autophagy selectivity in human cancer. Nat Commun. 2021 May 31;12(1):3258. doi: 10.1038/s41467-021-23539-5. PMID: 34059679; PMCID: PMC8166871.


- Ibrahim T, Khandare V, Mirkin FG, Tumtas Y, Bubeck D, Bozkurt TO. AlphaFold2-multimer guided high-accuracy prediction of typical and atypical ATG8-binding motifs. PLoS Biol. 2023 Feb 8;21(2):e3001962. doi: 10.1371/journal.pbio.3001962. PMID: 36753519; PMCID: PMC9907853.

- Kalvari I, Tsompanis S, Mulakkal NC, Osgood R, Johansen T, Nezis IP, Promponas VJ. iLIR: A web resource for prediction of Atg8-family interacting proteins. Autophagy. 2014 May;10(5):913-25. doi: 10.4161/auto.28260. Epub 2014 Feb 26. PMID: 24589857; PMCID: PMC5119064.

- Kołodziej M, Tsapras P, Konstantinou A, Cameron AD, Promponas V, Nezis IP. Deformed wings is an Atg8a-interacting protein that negatively regulates autophagy. bioRxiv. 2023. doi: 10.1101/2023.02.03.526972.

- Zeke A, Gibson TJ, Dobson L. Linear motifs regulating protein secretion, sorting and autophagy in Leishmania parasites are diverged with respect to their host equivalents. PLoS Comput Biol. 2024 Feb 16;20(2):e1011902. doi: 10.1371/journal.pcbi.1011902. PMID: 38363808; PMCID: PMC10903960.