Annotate diseases/phenotypes using chatGPT #19

Closed
7 tasks done
bschilder opened this issue Mar 20, 2023 · 18 comments
Labels: enhancement (New feature or request)

Comments

@bschilder
Contributor

bschilder commented Mar 20, 2023

(checked boxes indicate at least an initial attempt has been made)

Annotations

  • Severity score (without criterion).
  • Severity score (using Lazarin 2014 (table 2) criteria)
  • Childhood onset
  • Causes death

Models

  • chatGPT (gpt-3.5)
  • chatGPT (gpt-4) (pending payment details from @NathanSkene)
  • bioGPT

Related

Some of my initial attempts are documented within this R package:
https://github.com/neurogenomics/gptPhD

@KittyMurphy once you have a chance please report your progress here. I'll do the same.

@bschilder changed the title from "Annotate disease severity using chatGPT" to "Annotate diseases/phenotypes using chatGPT" on Mar 20, 2023
@bschilder
Contributor Author

@KittyMurphy please document your progress on this here

@KittyMurphy
Contributor

KittyMurphy commented Mar 26, 2023

Annotating HPO phenotypes using chatGPT via gptstudio

Set up

install.packages("gptstudio")
library(gptstudio)

# Load HPO terms 
terms_dt = HPOExplorer::load_phenotype_to_genes(3)
terms_cols = list(name="Phenotype",
                  id="ID")

# Get unique terms and their IDs
terms_dt_sub <- unique(terms_dt[,unname(unlist(terms_cols)), with=FALSE])

Attempt #1

Here I'm using the congenital onset terms (without HPO ID) that were provided to us by Peter Robinson. Will also try:

  • inputting HPO ID into prompt
  • asking chatGPT to add column with HPO ID
# congenital onset terms without HPO ID
congenital_onset <- "Syndactyly; 
Ventricular septal defect; Atrioventricular canal defect; 
Atrial septal defect; Abnormal connection of the cardiac segments; 
Fetal anomaly; Neural tube defect; 
Coloboma; Microtia; Cryptotia; 
Cupped ear; Cleft helix; Low-set ears; 
Synotia; Holoprosencephaly; Exstrophy; 
Abdominal wall defect; Abnormal lung lobation; 
Unilateral primary pulmonary dysgenesis"

# define the effects you need answers to e.g. does the phenotype cause death
effects <- "mental retardation, death, impaired mobility, 
physical malformations, blindness, sensory impairments, 
immunodeficiency, cancer, reduced fertility."

# define the columns of the output table 
table_columns <- "phenotype, mental retardation, death, impaired mobility,
physical malformations, blindness, sensory impairments, immunodeficiency, cancer, 
reduced fertility, congenital onset, justification."

# define chatGPT prompt
question = paste("Do:", 
                 congenital_onset, 
                 ", typically cause:",
                 effects, 
                 "Do they have congenital onset?",
                 "You must give one-word yes or no answers and give a justification for why they do or don't have congenital onset.",
                 "You must provide the output in .tsv format with columns:",
                 table_columns)
question <- gsub("\n", "", question)

# run chatgpt 5 times for the same prompt
n = 5
run_chatgpt <- function(q){
  all_res <- gptstudio::openai_create_chat_completion(prompt = q)
  # parse the tab-separated reply into a data.table
  data.table::fread(text = all_res[["choices"]]$message.content)
}

res_allPheno <- lapply(seq_len(n), function(x) run_chatgpt(question))

res_allPheno_dt <- data.table::rbindlist(res_allPheno, fill = TRUE,
                                         use.names = TRUE,
                                         idcol = "iteration")

# order alphabetically so that you can compare results across phenotypes
res_allPheno_dt <- res_allPheno_dt[order(res_allPheno_dt$phenotype), ]

Below is a subset of res_allPheno_dt. The answers chatGPT gives across iterations of the same prompt are not consistent, e.g. look at mental retardation for coloboma. A coloboma is an area of missing tissue in the eye and, from a quick Google search, is not associated with mental retardation.

iteration phenotype mental retardation death impaired mobility physical malformations blindness sensory impairments immunodeficiency cancer reduced fertility congenital onset justification
1 Atrioventricular canal defect Yes Yes Yes Yes No No No No No Yes Congenital heart defect present at birth
2 Atrioventricular canal defect Yes, in some cases May lead to premature death no May lead to growth failure, fatigue or rapid breathing May lead to vision problems None None None No AV canal defect is present at birth and is a congenital condition.  
3 Atrioventricular canal defect Yes Yes No Yes No No No No No Yes Atrioventricular canal defect is a congenital heart defect in which there is an opening in the center of the heart where the walls separating the heart chambers should be.
4 Atrioventricular canal defect Yes Possible None Physical malformations No No No No No Yes Congenital onset is typical of this phenotype as it is a result of abnormal development of the heart during fetal development.
5 Atrioventricular canal defect Yes Yes No Yes No No No No No Yes It is a congenital heart defect that is present at birth.
1 Cleft helix No No No Yes No No No No No Yes Congenital ear malformation present at birth
2 Cleft helix No None None May lead to physical malformations of the ear None None None None Yes Cleft helix is present at birth and is a congenital condition.  
3 Cleft helix No No No Yes No No No No No Yes Cleft helix is a congenital anomaly characterized by a cleft or gap in the top part of the ear.
4 Cleft helix No None None Physical malformations No No No No No Yes Congenital onset is typical of this phenotype as it is a result of incomplete development of the ear during fetal development.
5 Cleft helix No No No Yes No No No No No Yes A cleft helix is a rare congenital malformation of the ear.
1 Coloboma Yes No No Yes Yes Yes No No No Yes Present at birth and can affect vision and eye structure
2 Coloboma No May lead to vision problems or blindness May depend on location on the body None May lead to vision problems or blindness May lead to hearing loss or deafness None None No Coloboma is present at birth and is a congenital condition.  
3 Coloboma Yes No No Yes Yes Yes No No No Yes Coloboma is a congenital anomaly characterized by a gap or hole in one of the structures of the eye.
4 Coloboma No None None Physical malformations Possible Possible No No No Yes Congenital onset is typical of this phenotype as it is a result of incomplete fusion of the tissues that form the eye during fetal development.
5 Coloboma Yes No No Yes Yes No No No No Yes A coloboma is a birth defect that affects the eye.
1 Cryptotia No No No Yes No No No No No Yes Congenital ear malformation present at birth
2 Cryptotia No None None May lead to physical malformations of the ear None None None None Yes Cryptotia is present at birth and is a congenital condition.  
3 Cryptotia No No No Yes No No No No No Yes Cryptotia is a congenital anomaly characterized by a hidden ear that is partially or completely covered by skin.
4 Cryptotia No None None Physical malformations No No No No No Yes Congenital onset is typical of this phenotype as it is a result of abnormal development of the ear during fetal development.
5 Cryptotia No No No Yes No No No No No Yes Cryptotia is a congenital ear deformity.
1 Cupped ear No No No Yes No No No No No Yes Congenital ear malformation present at birth
2 Cupped ear No None None May lead to physical malformations of the ear None None None None Yes Cupped ear is present at birth and is a congenital condition.  
3 Cupped ear No No No Yes No No No No No Yes Cupped ear is a congenital anomaly characterized by an ear that is shaped like a cup and protrudes outward from the side of the head.
4 Cupped ear No None None Physical malformations No No No No No Yes Congenital onset is typical of this phenotype as it is a result of abnormal development of the ear during fetal development.
5 Cupped ear No No No Yes No No No No No Yes A cupped ear is a congenital malformation.
1 Exstrophy Yes No Yes Yes No No No No No Yes Present at birth and affects bladder and pelvic development
2 Exstrophy No None None May lead to physical malformations of the abdominal wall or pelvic organs None None None May lead to reduced fertility Yes Exstrophy is present at birth and is a congenital condition.  
3 Exstrophy Yes No Yes Yes No No No No No Yes Exstrophy is a congenital anomaly characterized by a defect in the abdominal wall or bladder.
4 Exstrophy No None None Physical malformations No No No No No Yes Congenital onset is typical of this phenotype as it is a result of abnormal development of the abdominal wall during fetal development.
5 Exstrophy Yes No Yes Yes No No No No No Yes Exstrophy is a congenital abnormality where the bladd
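
The table above mixes one-word answers with free text ("Yes, in some cases", "None", "Possible"), which makes iterations hard to compare directly. A minimal sketch of one way to collapse such answers before comparing them is shown here. `normalise_answer` is a hypothetical helper, not part of gptstudio or HPOExplorer, and the mapping rules (anything starting with "yes" counts as yes, "none" counts as no, everything else becomes NA) are assumptions that may need adjusting.

```r
# Hypothetical helper: collapse free-text answers into "yes", "no" or NA.
# Assumed rules: answers starting with "yes" -> "yes"; exactly "no" or
# "none" -> "no"; anything else (e.g. "Possible") -> NA.
normalise_answer <- function(x) {
  x <- tolower(trimws(x))
  out <- rep(NA_character_, length(x))
  out[grepl("^yes", x)] <- "yes"
  out[grepl("^(no|none)$", x)] <- "no"
  out
}

# Example answers taken from the table above
answers <- c("Yes", "Yes, in some cases", "No", "None", "Possible",
             "May lead to premature death")
normalised <- normalise_answer(answers)
print(normalised)
```

Answers that end up as NA could then be counted separately as a rough measure of how often the model ignored the one-word instruction.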

Attempt #2

What if I run the prompt one phenotype at a time, with 3 iterations?

# split on ";" plus any trailing whitespace or newlines
congenital_onset_split <- as.list(strsplit(congenital_onset, ";\\s*")[[1]])

results_list <- list() 

for (j in 1:3) { 
  res_individualPheno <- lapply(seq_along(congenital_onset_split), function(i){
    pheno <- congenital_onset_split[[i]]
    question = paste("Does",
                     pheno, 
                     "typically cause:", 
                     effects,
                     "Does",
                     pheno, 
                     "have congenital onset?",
                     "You must give one-word yes or no answers and give a justification for why it does or doesn't have congenital onset.",
                     "You must provide the output in .tsv format with columns:",
                     table_columns)
    question <- gsub("\n", "", question)
    print(question)
    all_res <- gptstudio::openai_create_chat_completion(prompt = question)
    choices <- data.table::fread(text = all_res[["choices"]]$message.content)
    return(choices)
  })
  results_list[[j]] <- res_individualPheno
}

# flatten the per-iteration lists into a single list of tables
res_flat <- unlist(results_list, recursive = FALSE)

# note: idcol indexes the flattened list (one entry per phenotype per loop),
# not the loop iteration j
res_individualPheno_dt <- data.table::rbindlist(res_flat, fill = TRUE,
                                         use.names = TRUE,
                                         idcol = "iteration")

# order alphabetically so that you can compare results across phenotypes
res_individualPheno_dt <- res_individualPheno_dt[order(res_individualPheno_dt$phenotype), ]

Below is a subset of res_individualPheno_dt; I've shown the same phenotypes as for res_allPheno_dt for comparison. There seems to be more consistency across the iterations when you run chatGPT on each phenotype individually.

| phenotype | mental retardation | death | impaired mobility | physical malformations | blindness | sensory impairments | immunodeficiency | cancer | reduced fertility | congenital onset | justification | justification |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Atrioventricular canal defect | no | no | no | yes | no | no | no | no | no | yes | NA | Defect occurs during fetal development, therefore present at birth. |
| Atrioventricular canal defect | No | No | No | Yes | No | No | No | No | No | Yes | NA | Atrioventricular canal defect is a congenital heart defect. It is present at birth and develops as the heart forms during fetal development. |
| Atrioventricular canal defect | No | No | No | Yes | No | No | No | No | No | Yes | NA | Atrioventricular canal defect is a congenital heart defect that occurs during fetal development. |
| Cleft helix | No | No | No | Yes | No | No | No | No | No | Yes | NA | Cleft helix is a genetic condition that is present at birth, thus indicating that it has a congenital onset. |
| Cleft helix | No | No | No | Yes | No | No | No | No | No | Yes | NA | Cleft helix is a genetic condition, meaning it is present at birth and caused by inherited gene mutations. It is a congenital condition. |
| Cleft helix | No | No | No | Yes | No | No | No | No | No | Yes | NA | Congenital onset is indicated by the presence of a physical malformation at birth, which is true for cleft helix. |
| Coloboma | No | No | No | Yes | Yes | Yes | No | No | No | Yes | NA | Congenital onset means present at birth, and coloboma is a congenital condition that occurs when certain structures in the eye or other parts of the body don't develop properly during fetal growth. Therefore, it has a congenital onset. |
| Coloboma | No | No | No | Yes | Yes | Yes | No | No | No | Yes | NA | Congenital onset refers to a condition that is present at or before birth. Coloboma is a congenital condition, as it occurs when the eye doesn't develop properly during pregnancy. |
| Coloboma | no | no | no | yes | yes | yes | no | no | no | yes | NA | Coloboma is a congenital birth defect that affects the eyes, and it is usually present from birth. It is caused by abnormal development of the eye during gestation. |
| Cryptotia | No | No | No | Yes | No | No | No | No | No | Yes | NA | Cryptotia is a congenital ear anomaly. |
| Cryptotia | No | No | No | Yes | No | No | No | No | No | Yes | NA | Cryptotia is a congenital ear malformation that is present at birth. |
| Cryptotia | No | No | No | Yes | No | No | No | No | No | Yes | NA | Cryptotia is a congenital condition, meaning it is present at or before birth. |
| Cupped ear | No | No | No | Yes | No | No | No | No | No | Yes | NA | Cupped ear is associated with physical malformations and is present at birth (congenital). |
| Cupped ear | no | no | no | yes | no | no | no | no | no | yes | NA | The development of an ear occurs during fetal development, hence the onset of cupped ear is congenital. |
| Cupped ear | No | No | No | Yes | No | No | No | No | No | Yes | NA | It is a congenital deformity that occurs during fetal development. |
| Exstrophy | No | No | Yes | Yes | No | No | No | No | Yes | Yes | NA | It is a birth defect that occurs during fetal development. |
| Exstrophy | No | No | Yes | Yes | No | No | No | Yes | Yes | Yes | NA | Exstrophy is a congenital anomaly that occurs during fetal development. The anterior body wall fails to properly fuse together, resulting in the exposure of internal organs. |
| Exstrophy | No | No | Yes | Yes | No | No | No | No | Yes | Yes | NA | Consequence of abnormal embryonic development |

Attempt #3

Here I'm repeating attempt #1 with the addition of providing chatGPT with the definition of each congenital onset term.

# make dataframe with congenital onset phenotypes and their IDs, match column names to those in hpo meta 
congenital_onset_dt <- data.table::data.table(preferredlabel = c("Syndactyly",
                            "Ventricular septal defect",
                            "Atrioventricular canal defect",
                            "Atrial septal defect",
                            "Abnormal connection of the cardiac segments",
                            "Fetal anomaly",
                            "Neural tube defect",
                            "Coloboma",
                            "Microtia",
                            "Cryptotia",
                            "Cupped ear",
                            "Cleft helix",
                            "Low-set ears",
                            "Synotia",
                            "Holoprosencephaly",
                            "Exstrophy",
                            "Abdominal wall defect",
                            "Abnormal lung lobation",
                            "Unilateral primary pulmonary dysgenesis"),
                   HPO_ID = c("HP:0001159",
                          "HP:0001629",
                          "HP:0006695",
                          "HP:0001631",
                          "HP:0011545",
                          "HP:0034057",
                          "HP:0045005",
                          "HP:0000589",
                          "HP:0008551",
                          "HP:0011252",
                          "HP:0000378",
                          "HP:0009902",
                          "HP:0000369",
                          "HP:0100663",
                          "HP:0001360",
                          "HP:0100548",
                          "HP:0010866",
                          "HP:0002101",
                          "HP:0006549"))


# get HPO metadata table for all descendant terms of 'phenotypic abnormality' 
hpo_meta <- HPOExplorer::make_phenos_dataframe("HP:0000118")

# get meta info for congenital onset phenotypes
congenital_onset_dt <- merge(congenital_onset_dt, hpo_meta)

# phenos + definition for prompt, note that some don't have a definition in the hpo_meta table
phenos <- paste(
  paste0(congenital_onset_dt$preferredlabel,
         " - ", congenital_onset_dt$definition),
  collapse="; "
) 

phenos <- gsub("\"\"","'", phenos)

# define chatGPT prompt
question = paste("Do:", 
                 phenos, 
                 ", typically cause:",
                 effects, 
                 "Do they have congenital onset?",
                 "You must give one-word yes or no answers and give a justification for why they do or don't have congenital onset.",
                 "You must provide the output in .tsv format with columns:",
                 table_columns)
question <- gsub("\n", "", question)

# run chatgpt 5 times for the same prompt
n = 5
run_chatgpt <- function(q){
  all_res <- gptstudio::openai_create_chat_completion(prompt = q)
  data.table::fread(text = all_res[["choices"]]$message.content)
}

res_multiPheno_def <- lapply(seq_len(n), function(x) run_chatgpt(question))

res_multiPheno_def_dt <- data.table::rbindlist(res_multiPheno_def,fill = TRUE,
                                         use.names = TRUE,
                                         idcol = "iteration")

# order alphabetically so that you can compare results across phenotypes
res_multiPheno_def_dt <- res_multiPheno_def_dt[order(res_multiPheno_def_dt$phenotype), ]

Here is a subset of res_multiPheno_def_dt. Including the definition in the prompt seems to (i) improve consistency in results but (ii) reduce accuracy, e.g. coloboma doesn't seem to be associated with mental retardation, and Atrioventricular canal defect does not 'typically' cause death if there is surgical intervention (see below the table for a more detailed answer for this phenotype from chatGPT).

| iteration | phenotype | mental retardation | death | impaired mobility | physical malformations | blindness | sensory impairments | immunodeficiency | cancer | reduced fertility | congenital onset | justification |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Atrioventricular canal defect | Yes | Yes | No | Yes | No | No | No | No | No | Yes | Present at birth (congenital). |
| 2 | Atrioventricular canal defect | Yes | Yes | Yes | Yes | No | No | No | No | No | Yes | This condition is present at birth and affects the heart. |
| 3 | Atrioventricular canal defect | Yes | Yes | No | Yes | No | No | No | No | No | Yes | This is a defect in the atrioventricular septum of the heart which is a congenital defect. |
| 4 | Atrioventricular canal defect | Yes | Yes | No | Yes | No | No | No | No | No | Yes | The term refers to a congenital heart defect that is present at birth (congenital). |
| 5 | Atrioventricular canal defect | Yes | Yes | No | Yes | No | No | No | No | No | Yes | Congenital onset is specified in the definition. |
| 1 | Cleft helix | No | No | No | Yes | No | No | No | No | No | Yes | Present at birth (congenital). |
| 2 | Cleft helix | No | No | No | Yes | No | No | No | No | No | Yes | This is a congenital abnormality that affects the ear. |
| 3 | Cleft helix | No | No | No | Yes | No | No | No | No | No | Yes | Cleft helix is a defect that is present since birth. |
| 4 | Cleft helix | No | No | No | Yes | No | No | No | No | No | Yes | The term refers to a developmental defect of the helix of the ear that is present at birth (congenital). |
| 5 | Cleft helix | No | No | No | Yes | No | No | No | No | No | Yes | Congenital onset is specified in the definition. |
| 1 | Coloboma | Yes | No | No | Yes | Yes | Yes | No | No | No | Yes | Develops during fetal development and is present at birth (congenital). |
| 2 | Coloboma | Yes | No | No | Yes | Yes | No | No | No | No | Yes | This is a developmental defect that is present at birth. |
| 3 | Coloboma | Yes | Yes | No | Yes | Yes | Yes | No | No | No | Yes | Coloboma is a developmental defect that occurs during embryonic development. |
| 4 | Coloboma | Yes | No | No | Yes | Yes | Yes | No | No | No | Yes | The term refers to a developmental defect of the eye that is present at birth (congenital). |
| 5 | Coloboma | Yes | No | No | Yes | Yes | Yes | No | No | No | Yes | Congenital onset is specified in the definition. |
| 1 | Cryptotia | No | No | Yes | Yes | No | No | No | No | No | Yes | Present at birth (congenital). |
| 2 | Cryptotia | No | No | No | Yes | No | No | No | No | No | Yes | This is a congenital abnormality that affects the ear. |
| 3 | Cryptotia | No | No | No | Yes | No | No | No | No | No | Yes | Cryptotia is present at birth. |
| 4 | Cryptotia | No | No | No | Yes | No | No | No | No | No | Yes | The term refers to a developmental defect of the auricle of the ear that is present at birth (congenital). |
| 5 | Cryptotia | No | No | No | Yes | No | No | No | No | No | Yes | Congenital onset is specified in the definition. |
| 1 | Cupped ear | No | No | No | Yes | No | No | No | No | No | Yes | Present at birth (congenital). |
| 2 | Cupped ear | No | No | No | Yes | No | No | No | No | No | Yes | This is a congenital abnormality that affects the ear. |
| 3 | Cupped ear | No | No | No | Yes | No | No | No | No | No | Yes | This is a defect in ear folding which occurs during embryonic development. |
| 4 | Cupped ear | No | No | No | Yes | No | No | No | No | No | Yes | The term refers to a developmental defect of the ear that is present at birth (congenital). |
| 5 | Cupped ear | No | No | No | Yes | No | No | No | No | No | Yes | Congenital onset is specified in the definition. |
| 1 | Exstrophy | Yes | Yes | Yes | Yes | No | No | No | No | No | Yes | Present at birth (congenital). |
| 2 | Exstrophy | No | No | No | Yes | No | No | No | No | No | Yes | This is a developmental defect that is present at birth. |
| 3 | Exstrophy | No | No | Yes | Yes | No | No | No | No | No | Yes | Exstrophy is a result of developmental defects in embryonic development. |
| 4 | Exstrophy | No | No | Yes | Yes | No | No | No | No | No | Yes | The term refers to a developmental defect of the abdominal wall that is present at birth (congenital). |
| 5 | Exstrophy | Yes | Yes | Yes | Yes | No | No | No | No | No | Yes | Congenital onset is specified in the definition. |

[Image: Screenshot 2023-03-27 at 11 14 53 am]

Attempt #4

Here I'm repeating attempt #2 with the addition of providing chatGPT with the definition of each congenital onset term.
results_list <- list()

for (j in 1:3) { 
  res_indPheno_def <- lapply(seq_len(nrow(congenital_onset_dt)), function(i){
    pheno <- congenital_onset_dt$preferredlabel[[i]]
    definition <- congenital_onset_dt$definition[[i]]
    question <- paste("Does",
                      pheno, 
                      "-",
                      definition,
                      ", typically cause:", 
                      effects,
                      "Does",
                      pheno, 
                      "have congenital onset?",
                      "You must give one-word yes or no answers and give a justification for why it does or doesn't have congenital onset.",
                      "You must provide the output in .tsv format with columns:",
                      table_columns)
    question <- gsub("\n", "", question)
    # fixed = TRUE so that "." is matched literally, not as a regex wildcard
    question <- gsub(". , typically", ", typically", question, fixed = TRUE)
    all_res <- gptstudio::openai_create_chat_completion(prompt = question)
    data.table::fread(text = all_res[["choices"]]$message.content)
  })
  results_list[[j]] <- res_indPheno_def
}

# flatten the per-iteration lists into a single list of tables
res_flat <- unlist(results_list, recursive = FALSE)

# note: idcol indexes the flattened list, not the loop iteration j
res_indPheno_def_dt <- data.table::rbindlist(res_flat, fill = TRUE,
                                                use.names = TRUE,
                                                idcol = "iteration")

# order alphabetically so that you can compare results across phenotypes
res_indPheno_def_dt <- res_indPheno_def_dt[order(res_indPheno_def_dt$phenotype), ]

Here is a subset of res_indPheno_def_dt.

| iteration | phenotype | mental retardation | death | impaired mobility | physical malformations | blindness | sensory impairments | immunodeficiency | cancer | reduced fertility | congenital onset | justification | justification |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5 | Atrioventricular canal defect | No | No | No | Yes | No | No | No | No | No | Yes | Cause is a defect of the atrioventricular septum which develops during fetal development, making it congenital. | NA |
| 24 | Atrioventricular canal defect | No | No | No | Yes | No | No | No | No | No | Yes | Atrioventricular canal defect is a congenital heart defect, meaning it is present at birth. | NA |
| 43 | Atrioventricular canal defect | No | Yes | No | Yes | No | No | No | No | No | Yes | Atrioventricular canal defect is a congenital heart defect that is present at birth. | NA |
| 6 | Cleft helix | No | No | No | Yes | No | No | No | No | No | Yes | Cleft helix is a congenital malformation that occurs during fetal development. | NA |
| 25 | Cleft helix | No | No | No | Yes | No | No | No | No | No | Yes | Cleft helix is a physical malformation that is present at birth and affects the ear. | NA |
| 44 | Cleft helix | No | No | No | Yes | No | No | No | No | No | Yes | Cleft helix is a physical malformation of the ear that is present at birth, indicating a congenital onset. | NA |
| 7 | Coloboma | No | No | No | Yes | Yes | No | No | No | No | Yes | Coloboma is a congenital condition as it results from incomplete closure of the optic fissure during embryonic development, which occurs during the early stages of fetal development. | NA |
| 26 | Coloboma | No | No | No | Yes | Yes | Yes | No | No | No | Yes | Coloboma is a developmental defect that is present at birth, therefore it has a congenital onset. | NA |
| 45 | Coloboma | no | no | no | yes | yes | yes | no | no | no | yes | It is a developmental defect, meaning it occurs during fetal development and is present at birth. | NA |
| 8 | Cryptotia | No | No | No | Yes | No | No | No | No | No | Yes | Cryptotia is a congenital condition, meaning it is present at birth. It is caused by abnormal development of the ear during fetal development. | NA |
| 27 | Cryptotia | No | No | No | Yes | No | No | No | No | No | Yes | Cryptotia is a congenital anomaly caused by abnormal development of the auricle in utero. | NA |
| 46 | Cryptotia | No | No | No | Yes | No | No | No | No | No | Yes | Cryptotia is a congenital anomaly that develops during fetal growth and is present at birth. | NA |
| 9 | Cupped ear | No | No | No | Yes | No | No | No | No | No | Yes | Cupped ear is a physical malformation that is present at birth, thus it has a congenital onset. | NA |
| 28 | Cupped ear | No | No | No | Yes | No | No | No | No | No | Yes | Cupped ear is a physical malformation that is present at birth, indicating congenital onset. | NA |
| 47 | Cupped ear | No | No | No | Yes | No | No | No | No | No | Yes | Cupped ear is a physical malformation that is present at birth and does not develop later in life. Therefore, it has a congenital onset. | NA |
| 10 | Exstrophy | No | No | Yes | Yes | No | No | No | No | Yes | Yes | Exstrophy is a congenital birth defect that occurs during fetal development. | NA |
| 29 | Exstrophy | No | No | Yes | Yes | No | No | No | No | Yes | Yes | Exstrophy is a congenital abnormality, present at birth. | NA |
| 48 | Exstrophy | No | No | Yes | Yes | No | No | No | No | Yes | Yes | Exstrophy is a congenital condition that occurs during. | NA |

@bschilder @NathanSkene

@NathanSkene

NathanSkene commented Mar 26, 2023 via email

@bschilder
Contributor Author

bschilder commented Mar 26, 2023

Nice progress @KittyMurphy. That's interesting about the responses being more consistent when provided individually. Wondering if this has to do with informational overload like we were discussing before. Might be an aspect of chatGPT that other people have noticed and documented.

One thing that would be helpful is to come up with a function that computes consistency scores for each metric. That will give us at least some quantitative metric of performance (tho not exactly the ground truth). Something like:

dat=xlsx::read.xlsx("~/Downloads/annot.xlsx",1)
avg <- dplyr::group_by(dat, phenotype) |> dplyr::summarise( mental.retardation_consistency=1/length(unique(mental.retardation)))
avg

[Image: Screenshot 2023-03-26 at 14 34 57]

After computing the within phenotype consistency, you can compute mean consistency:

mean(avg$mental.retardation_consistency)
# 0.75
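
The same idea can be extended to every metric column at once. Below is a minimal base-R sketch under a few assumptions: `consistency` and `mean_consistency` are hypothetical helper names, the `toy` data frame is made up for illustration, and answers are lower-cased before comparison (the snippet above does not do this) so that "No" and "no" count as the same answer.

```r
# Consistency of one metric within one phenotype:
# 1 / number of distinct answers (case-insensitive, an assumption)
consistency <- function(x) 1 / length(unique(tolower(x)))

# Hypothetical helper: mean within-phenotype consistency per metric column
mean_consistency <- function(dat, metrics, by = "phenotype") {
  sapply(metrics, function(m) {
    mean(tapply(dat[[m]], dat[[by]], consistency))
  })
}

# Made-up toy data, not real chatGPT output
toy <- data.frame(
  phenotype = rep(c("Coloboma", "Cryptotia"), each = 4),
  mental.retardation = c("Yes", "No", "Yes", "No", "No", "No", "No", "No"),
  death = c("No", "no", "No", "No", "No", "No", "No", "No")
)

scores <- mean_consistency(toy, c("mental.retardation", "death"))
print(scores)
```

A score of 1 means every iteration gave the same answer for every phenotype; lower values mean more disagreement. As noted above, this measures agreement between runs, not correctness.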

That prompt is not including the description of the phenotype is it?

@NathanSkene I believe this is only providing the chatGPT with the name of the phenotype, not the full description of it. Thus, any other information about the disease is being pulled from the LLM itself.

@NathanSkene

NathanSkene commented Mar 26, 2023 via email

@KittyMurphy
Contributor

Already working on adding the description, @bschilder I assume the best way to get this is to use the definition column in HPOExplorer::make_phenos_dataframe?

@bschilder
Contributor Author

Already working on adding the description, @bschilder I assume the best way to get this is to use the definition column in HPOExplorer::make_phenos_dataframe?

Yeah, that'll work. Or the subfunction which is more direct:
HPOExplorer::add_hpo_definition()

@NathanSkene

The current prompts do not include a statement for "Do not consider indirect effects". Would be worth adding this in and seeing if it makes any difference.
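
One way to test this would be to build the prompt with a switch for that statement, so the same phenotype can be queried with and without it. This is only a sketch: `build_prompt` and its `direct_only` argument are hypothetical names, and the wording follows the per-phenotype prompts used earlier in this thread.

```r
# Hypothetical helper: assemble the per-phenotype prompt, optionally adding
# the "Do not consider indirect effects." instruction.
build_prompt <- function(pheno, effects, table_columns, direct_only = TRUE) {
  parts <- c(
    paste("Does", pheno, "typically cause:", effects),
    paste("Does", pheno, "have congenital onset?"),
    "You must give one-word yes or no answers and give a justification for why it does or doesn't have congenital onset.",
    # evaluates to NULL (and is dropped by c()) when direct_only is FALSE
    if (direct_only) "Do not consider indirect effects.",
    paste("You must provide the output in .tsv format with columns:",
          table_columns)
  )
  gsub("\n", "", paste(parts, collapse = " "))
}

effects_demo <- "death, blindness."
columns_demo <- "phenotype, death, blindness, congenital onset, justification."

prompt_direct <- build_prompt("Coloboma", effects_demo, columns_demo)
prompt_plain  <- build_prompt("Coloboma", effects_demo, columns_demo,
                              direct_only = FALSE)
print(prompt_direct)
```

Running both variants for the same phenotypes and comparing the answers would show whether the extra instruction makes any difference.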

@bschilder
Contributor Author

bschilder commented May 9, 2023

I tried out AutoGPT to see if this might be a useful avenue. Here’s what I learned:

Pros

  1. It can search the internet, via APIs or via Selenium queries. For example, if you ask it something it’s unsure about, it can read the relevant literature/databases on the topic to gain more expertise in that area.
  2. It has built-in python code for reading/writing code or other files. This means no need to copy-and-paste output from the browser interface. Using this feature I was able to tell it to read in a series of CSVs with 100 HPO terms each (that I had created beforehand) so that each query was a manageable size that didn’t exceed the token limit.
  3. There is a dedicated Docker container to run AutoGPT. The instructions are not super straightforward (or correct) but after some troubleshooting and checking the GitHub Issues I was able to get things working. I took notes on exactly how to do this and will share.
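
The batching step mentioned in point 2 (CSVs of 100 HPO terms each) can be sketched as follows. This is not the code that was actually used: `split_batches`, the file name pattern and the placeholder term names are all assumptions for illustration.

```r
# Hypothetical helper: split a vector of terms into batches of at most `size`
split_batches <- function(terms, size = 100) {
  split(terms, ceiling(seq_along(terms) / size))
}

# Placeholder names, not real HPO terms
terms <- paste0("term_", seq_len(250))
batches <- split_batches(terms)
batch_sizes <- lengths(batches)
print(batch_sizes)

# Write each batch to its own CSV (here in a temporary directory)
out_dir <- tempdir()
for (i in seq_along(batches)) {
  write.csv(data.frame(Phenotype = batches[[i]]),
            file.path(out_dir, sprintf("hpo_batch_%03d.csv", i)),
            row.names = FALSE)
}
```

The batch size of 100 comes from the comment above; a smaller size may be needed if replies are cut off by the token limit.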

Cons

  1. As very few people have API access to GPT4 atm, it means that when we use AutoGPT we can only use the GPT3.5-turbo model. As you know, this is not as sophisticated of a model and will do things like write lazy code that just assigns the same annotations to every phenotype, or simply do substring searches for the term “blindness” within the HPO term itself (which isn’t very useful).
  2. It requires you to have a paid OpenAI account. In the interest of time, I just entered my personal credit card details. It’s actually not too bad; after a whole day of making hundreds of queries I only racked up $1.46 in charges. But still something to be mindful of.
  3. It’s very tricky to get it to do what you actually want, and requires a lot of trial-and-error to get it close. This will hopefully be better with GPT4, but in the meantime I wasn’t able to get it to produce any kind of meaningful annotation for the HPO terms.

@bschilder
Contributor Author

Here is my favorite example of how AutoGPT can be very lazy 😅
[Image: Screenshot 2023-05-05 at 22 27 30]

@KittyMurphy
Contributor

KittyMurphy commented May 19, 2023

I have now performed a trial run to annotate phenotypes using chat gpt via selenium. Initially we asked gpt to provide the output in .tsv format but I had difficulty trying to extract this from the chat interface into python. To overcome this, I asked gpt to provide the output as python code that I could then run to generate a data frame. @bschilder noted that earlier versions of gpt could sometimes be lazy when asking for code.

Here is a prompt example:
"I need to annotate phenotypes as to whether they typically cause: intellectual disability, death, impaired mobility, physical malformations, blindness, sensory impairments, immunodeficiency, cancer, reduced fertility? Do they have congenital onset? You must give one-word yes or no answers. Do not consider indirect effects. You must provide the output in python code as a data frame called df with columns: phenotype, intellectual_disability, death, impaired_mobility, physical_malformations, blindness, sensory_impairments, immunodeficiency, cancer, reduced_fertility, congenital_onset, justification. These are the phenotypes: Abnormality of body height; Multicystic kidney dysplasia; Autosomal dominant inheritance; Autosomal recessive inheritance; Abnormal morphology of female internal genitalia; Functional abnormality of the bladder; Recurrent urinary tract infections; Neurogenic bladder; Urinary urgency; Hypoplasia of the uterus; Abnormality of the bladder; Bladder diverticulum"

Here is the trial run using ~100 phenotypes (note, there are ~200 because I think I appended the results twice by mistake): annot_HPO_gpt_test.csv

@NathanSkene noted that the phenotype 'Azoospermia' is not being annotated as reducing fertility. This is worrying given how the literature describes this phenotype:
"Azoospermia is the complete absence of spermatozoa in the ejaculate. It is the most severe and one of the leading causes of male infertility. The exact pathophysiology of azoospermia is not always known. Azoospermia can be due to pre-testicular, testicular, and post-testicular causes."

Next, I want to:

  • Run the prompt that included the 'Azoospermia' phenotype again, once asking gpt to provide the output as python code and once as a semi-colon separated list.
  • This time round, the python code output 'Azoospermia' as reducing fertility and the justification column had justifications and not just NAs.
  • I just realised that the prompt doesn't specify what the justification column should be for, maybe that's why it was outputting NAs previously.
  • Ask gpt to add a justification column for each phenotype. This might require including fewer phenotypes in the prompt so as to not overwhelm gpt with information (this seems to be an issue with earlier versions of gpt).
  • Seems to work well with the justifications but the response generation was stopped prematurely, probably due to token usage.
  • Repeated using only 4 phenotypes (used 12 before), again seems to be working well e.g. the justification column for reduced fertility for Azoospermia: 'Azoospermia leads to male infertility', but the response generation was also stopped prematurely.
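Since responses are sometimes cut off before every phenotype has been annotated, one way to catch this is to compare the phenotypes requested in a prompt against those that came back, and re-queue the rest. A minimal sketch, assuming the returned phenotype names have already been extracted into a list; `find_missing` is a hypothetical helper, and matching on exact names (after trimming whitespace) is an assumption:

```python
from typing import List


def find_missing(requested: List[str], returned: List[str]) -> List[str]:
    """Return the requested phenotypes that are absent from the response.

    Names are compared exactly after stripping surrounding whitespace,
    and the result keeps the order of the original request.
    """
    seen = {name.strip() for name in returned}
    return [name for name in requested if name.strip() not in seen]


requested = ["Azoospermia", "Urinary urgency", "Neurogenic bladder", "Bladder diverticulum"]
returned = ["Azoospermia", "Urinary urgency"]  # response stopped prematurely

# Phenotypes to include in a follow-up prompt.
print(find_missing(requested, returned))
```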

@bschilder
Contributor Author

Thanks @KittyMurphy !

A couple of other ideas for reducing token usage (tho whether this helps will depend on how OpenAI counts 'tokens', which i'm still not totally clear on):

  • Using a persistent session and only defining the task in the first prompt. After that, just keep asking it to produce the same output each time. Hopefully this won't impact the quality of the outputs.
  • Ask to return "Y/N" instead of "Yes/No"
  • Ask chatGPT to abbreviate column names (e.g. "Physical_Malformations"-->"PM")
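If the abbreviated format were adopted, the short column names and Y/N answers would need to be expanded again after each response. A rough sketch of that round trip, assuming abbreviations are formed from the initials of each column name (only "PM" is given above; the other abbreviations, `initials`, and `expand_row` are illustrative assumptions):

```python
from typing import Dict

# Column names as they appear in the annotation output.
COLUMNS = [
    "Intellectual_Disability", "Death", "Impaired_Mobility",
    "Physical_Malformations", "Blindness", "Sensory_Impairments",
    "Immunodeficiency", "Cancer", "Reduced_Fertility", "Congenital_Onset",
]


def initials(name: str) -> str:
    """Abbreviate a column name to the initials of its underscore-separated parts."""
    return "".join(part[0].upper() for part in name.split("_"))


# Map each abbreviation back to its full column name.
ABBREV: Dict[str, str] = {initials(col): col for col in COLUMNS}
if len(ABBREV) != len(COLUMNS):
    raise ValueError("abbreviations collide; choose a different scheme")

ANSWERS = {"Y": "Yes", "N": "No"}


def expand_row(short_row: Dict[str, str]) -> Dict[str, str]:
    """Expand abbreviated column names and Y/N answers to their full forms."""
    return {ABBREV[key]: ANSWERS[value] for key, value in short_row.items()}


print(expand_row({"PM": "Y", "RF": "N"}))
```

The collision check matters because initials are not guaranteed to be unique once more annotation columns are added.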

@bschilder
Contributor Author

bschilder commented May 19, 2023

Annotation output checks

All of the annotation validation procedures described below can be rerun on any new annotations using the new internal function: HPOExplorer:::check_annot_gpt
https://github.com/neurogenomics/HPOExplorer/blob/master/R/check_annot_gpt.R

Check phenotype names

Check that chatGPT hasn't modified the phenotype names in a way that prevents us from linking them back to the input HPO terms.

  ## `path` is the path to the chatGPT annotations file
  d <- data.table::fread(path, key = "Phenotype")
  annot <- HPOExplorer::load_phenotype_to_genes()
  ## Phenotype names that are absent from the HPO gene annotations
  d$Phenotype[!d$Phenotype %in% annot$Phenotype]
  # character(0)

✅ All phenotypes appear verbatim in the HPO gene annotations file.

Check annotation consistency

For phenotypes that chatGPT annotated more than once, how consistent are the Y/N annotations it gave for each?

  nm <- names(d)[!names(d) %in% c("Phenotype","Justification")]
  ## Proportion of "Yes" answers per phenotype, for each annotation column
  d_mean <- d[,lapply(.SD,function(x){mean(x=="Yes")}),.SDcols=nm, by="Phenotype"]
  ## Proportion of phenotypes whose answers were all "Yes" (1) or all "No" (0)
  d_consist <- lapply(d_mean[,-1], function(x)sum(x %in% c(0,1))/nrow(d_mean))
  d_consist
$Intellectual_Disability
[1] 1

$Death
[1] 1

$Impaired_Mobility
[1] 1

$Physical_Malformations
[1] 1

$Blindness
[1] 1

$Sensory_Impairments
[1] 1

$Immunodeficiency
[1] 1

$Cancer
[1] 1

$Reduced_Fertility
[1] 0.7708333

$Congenital_Onset
[1] 1
mean(unlist(d_consist))
#  0.9770833

✅ At least in this small subsample, 9/10 annotation columns are 100% consistent across chatGPT runs, giving an average consistency score of 97.7% across all annotations. "Reduced_Fertility" is the one to look out for, as chatGPT does not always provide the same annotation for it (77% consistent, which may not seem too bad, but remember that the baseline is 50% because the options are binary).

Check phenotype classifications

As some of these phenotypes belong to specific branches of the HPO that should guarantee a particular annotation (e.g. all forms of blindness phenotypes cause Blindness: 'Yes'), we can use this information to validate the chatGPT-provided annotations.

While this lets us confirm annotations that we would expect (true positives vs. false negatives), it doesn't let us say definitively whether some phenotypes do NOT cause a given condition such as blindness (true negatives).

  d$HPO_ID <- harmonise_phenotypes(phenotypes = d$Phenotype,
                                   as_hpo_ids = TRUE)
  ## Find matching HPO branches
  hpo <- get_hpo() 
  queries <- list(
    Intellectual_Disability=c("intellectual disability"),
    Impaired_Mobility=c("Abnormal central motor function",
                        "Abnormality of movement"),
    Physical_Malformations=c("malformation","morphology"),
    Blindness=c("^blindness"),
    Sensory_Impairments=c("Abnormality of vision",
                          "Abnormality of the sense of smell",
                          "Abnormality of taste sensation",
                          "Somatic sensory dysfunction",
                          "Hearing abnormality"
                          ),
    Immunodeficiency=c("Immunodeficiency"),
    Cancer=c("Neoplasm","Cancer"),
    Reduced_Fertility=c("Decreased fertility")
    ) 
  tiers <- lapply(queries, function(q){
    terms <- grep(paste(q,collapse = "|"),
         hpo$name,
         ignore.case = TRUE, value = TRUE)
    ontologyIndex::get_descendants(ontology = hpo,
                                   roots = names(terms),
                                   exclude_roots = FALSE) |>
      unique()
  })
  annot_check <- lapply(seq_len(nrow(d)), function(i){
    r <- d[i,]
    cbind(
      r[,c("Phenotype","HPO_ID")],
      lapply(stats::setNames(names(tiers),names(tiers)),
             function(x){
               if(r$HPO_ID %in% tiers[[x]]){
                 r[,x,with=FALSE][[1]]=="Yes"
               } else {
                 NA
               }
             }) |> data.table::as.data.table()
    )
  }) |> data.table::rbindlist()
  
### Proportion of rows where the annotation could not be checked (NA)
  missing_rate <- sapply(
    annot_check[,names(tiers),with=FALSE],
    function(x){sum(is.na(x))/length(x)})
missing_rate
Intellectual_Disability       Impaired_Mobility  Physical_Malformations 
              1.0000000               1.0000000               0.4558824 
              Blindness     Sensory_Impairments        Immunodeficiency 
              1.0000000               1.0000000               1.0000000 
                 Cancer       Reduced_Fertility 
              0.9901961               0.9607843 

True positive rate

### Proportion of checkable rows (non-NA) where the annotation was TRUE
true_pos_rate <- sapply(
  annot_check[,names(tiers),with=FALSE],
  function(x){sum(na.omit(x)==TRUE)/length(na.omit(x))})
true_pos_rate 
Intellectual_Disability       Impaired_Mobility  Physical_Malformations 
                    NaN                     NaN               0.5765766 
              Blindness     Sensory_Impairments        Immunodeficiency 
                    NaN                     NaN                     NaN 
                 Cancer       Reduced_Fertility 
              1.0000000               0.5000000 

False negative rate

### Proportion of checkable rows (non-NA) where the annotation was FALSE
false_neg_rate <- sapply(
  annot_check[,names(tiers),with=FALSE],
  function(x){sum(na.omit(x)==FALSE)/length(na.omit(x))})
false_neg_rate
Intellectual_Disability       Impaired_Mobility  Physical_Malformations 
                    NaN                     NaN               0.4234234 
              Blindness     Sensory_Impairments        Immunodeficiency 
                    NaN                     NaN                     NaN 
                 Cancer       Reduced_Fertility 
              0.0000000               0.5000000 

@KittyMurphy
Contributor

I have since updated the prompt twice.

Example prompt 1.1: I need to annotate phenotypes as to whether they typically cause: intellectual disability, death, impaired mobility, physical malformations, blindness, sensory impairments, immunodeficiency, cancer, reduced fertility? Do they always have congenital onset? You must give one-word yes or no answers. Do not consider indirect effects. You must provide the output in python code as a data frame called df with columns: phenotype, intellectual_disability, death, impaired_mobility, physical_malformations, blindness, sensory_impairments, immunodeficiency, cancer, reduced_fertility, congenital_onset. Also add justification columns for each outcome. These are the phenotypes: Recurrent urinary tract infections; Neurogenic bladder; Urinary urgency

Here are the results for ~500 phenotypes: gpt_hpo_annotations.csv. The issue here was that, for some of the phenotypic outcomes, we were getting answers other than yes or no, e.g. 'can be', 'may be'. To get around this, we decided to add a scale for the phenotypic outcomes: instead of yes or no answers, we ask chatGPT to answer using a scale of never, rarely, often, always. Due to limited token usage we had to drop the number of phenotypes in each prompt to two.

Example prompt 1.2: I need to annotate phenotypes as to whether they typically cause: intellectual disability, death, impaired mobility, physical malformations, blindness, sensory impairments, immunodeficiency, cancer, reduced fertility? Do they have congenital onset? To answer, use a severity scale of: never, rarely, often, always. Do not consider indirect effects. You must provide the output in python code as a data frame called df with columns: phenotype, intellectual_disability, death, impaired_mobility, physical_malformations, blindness, sensory_impairments, immunodeficiency, cancer, reduced_fertility, congenital_onset. Also add justification columns for each outcome. These are the phenotypes: Urinary urgency; Hypoplasia of the uterus
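To work with the four-level scale downstream, the answers could be converted to ordered codes, with anything off-scale (such as 'can be') flagged for review. A small sketch under stated assumptions: the numeric codes 0-3 and the `encode_answer` helper are illustrative choices, not something specified in the prompt.

```python
from typing import Optional

# Assumed ordinal coding of the scale used in prompt 1.2.
SCALE = {"never": 0, "rarely": 1, "often": 2, "always": 3}


def encode_answer(answer: str) -> Optional[int]:
    """Map a scale answer to its ordinal code, ignoring case and whitespace.

    Returns None for answers that are not on the scale, so that they
    can be flagged and re-queried rather than silently coerced.
    """
    return SCALE.get(answer.strip().lower())


answers = ["Never", " often ", "can be", "ALWAYS"]
print([encode_answer(a) for a in answers])
```

Returning None instead of raising keeps a long batch run going while still making off-scale answers easy to count afterwards.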

Here are the results so far: gpt_hpo_annotations_scale.csv

Currently waiting for help from Eugene to get this set up on a remote machine so that it can run 24/7, and it will probably take ~2 weeks.

@bschilder
Contributor Author

@KittyMurphy I'm looking into some resources that might be helpful:

ChatGPT File uploader (google chrome extension)
https://chrome.google.com/webstore/detail/chatgpt-file-uploader-ext/becfinhbfclcgokjlobojlnldbfillpf/

Bing Chat: Microsoft's iteration of ChatGPT:
https://www.bing.com/

@bschilder
Contributor Author

bschilder commented Nov 6, 2023

Update

Stage 1

  1. We first ran GPT annotations only for the 2,832 phenotypes that were significantly enriched for at least one cell type in our first round of analyses.

Stage 2

  1. Then, we expanded to all 10,969 phenotypes that appeared within the HPO gene annotations file. This should be sufficient for the first Rare Disease Celltyping paper, as it allows us to prioritise all phenotypes relevant for that paper.
annot=HPOExplorer::load_phenotype_to_genes()
length(unique(annot$hpo_name))
# [1] 10969

@KittyMurphy is running the last of these now.

Stage 3

  1. Finally, we will further extend our GPT annotations to all phenotypes in the HPO, which is currently 18,057 total phenotypes. This will be used for the GPT annotations manuscript.
hpo=HPOExplorer::get_hpo()
length(unique(hpo$name))
# [1] 18057

@KittyMurphy
Contributor

I've actually been using the below code to get the phenotypes:

annot <- HPOExplorer::make_phenos_dataframe()
length(unique(annot$hpo_id))
# [1] 10954

I'll make sure I run the remaining 15 phenotypes that are called with HPOExplorer::load_phenotype_to_genes() but just wanted to flag the discrepancy between the two.

@bschilder
Contributor Author

@KittyMurphy make_phenos_dataframe calls load_phenotype_to_genes to get the data, so they should be the same (unless somehow certain phenotypes get filtered in the former function).
https://github.com/neurogenomics/HPOExplorer/blob/master/R/make_phenos_dataframe.R

Could you check whether this discrepancy stems from:

  1. The functions themselves
  2. Different versions of the HPO ontology/genes data (note, data is cached by default)
  3. Different versions of HPOExplorer
